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THESIS  ABSTRACT 


suitability  checker  and  investigate  favorable  characteristics  of 
assembly  language  loops  with  respect  to  parallel  processability.  -  > 


uggestions  are  made  for  further  development  of  a  parallel  task 
recognizer  for  assembly  language  programs  using  Ramamoorthy ' s 
connectivity  analysis  method. 

Design  considerations  are  outlined  for  development  of  an  assembler 
scheduled  multi-microprocessor  system.  The  machine  would  execute  source 
program  partitions  in  parallel  on  a  production  basis  a  large  number  of 
times.  This  would  be  possible  after  a  combined  assembly  and  schedule  of 
the  load  modules. 

Applications  envisioned  are  microprocessor  based  controllers  or 
instruments  that  would  achieve  increased  speed  at  less  cost  by 
performing  such  operations  as  input,  calculation,  retrieval,  and  output 
simultaneously.  Also,  economical  machines  could  be  designed  to  study 
aspects  of  parallel  processing  for  large  scale  computers  and  high  level 
languages. 
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I.  INTRODUCTION 


This  thesis  addresses  the  problem  of  how  to  better  exploit  the  lew 
cost  of  microprocessors  and  overcome  the  drawback  of  limited  speed 
capability.  The  method  investigated  involves  using  several  co-operating 
parallel  processors  for  faster  execution  of  a  single  program  that  will 
be  assembled  once  and  run  many  times.  This  would  spread  the 
optimization  cost  over  a  very  large  number  of  applications  [1]. 

Many  reasons  have  been  given  for  parallel  processing.  A  very 
significant  increase  in  system  throughput  is  theoretically  possible 
depending  on  the  system  and  the  application  [2-10].  .Almost  every 
computer  program  has  some  potential  for  parallel  processing,  because  the 
input/output  (I/O)  can  be  overlapped  with  the  function  performed  by  the 
computer  [11].  The  relation  of  parallel  processing  to  time  sharing  also 
has  been  discussed  as  justification  [12],  Speed  is  not  the  only  reason 
however. 

Memory  and  processors  can  be  used  more  effectively. 
Microprocessors  are  now  very  inexpensive,  and  using  more  processors 
better  utilizes  the  more  expensive  memory.  This  is  possible  in  a  wide 
range  of  applications.  Most  microprocessor  systems  are  interrupt  driven 
and  should  have  good  potential  for  parallel  processing  because  the 
interrupt  task  can  usually  be  done  concurrently  with  the  main  program. 
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While  improving  existing  systems,  the  benefits  can  be  compounded  by 
developing  design  guidelines  for  future  systems. 

Using  microprocessor  systems  would  be  a  cost  effective  means  for 
further  research  on  parallel  processing.  This  research  is  needed, 
because  many  such  systems  have  been  suggested,  but  few  attempts  have 
been  made  to  partition  the  application  programs  into  parallel 
processable  segments  [8,91.  Some  work  has  been  done  in  this  area  for 
use  with  large  scale  computers,  however. 

Solutions  to  the  problems  of  synchronizing  shared  resources  have 
been  found  by  Djkstra,  Knuth,  and  Coffman  [11].  Previous  research  on 
parallel  processability  and  task  partitioning  on  high  level  language 
programs  has  been  done  by  Bernstein  [12]  and  Ramamoorthy  [1,11,13-15]. 
How  does  this  previous  work  relate  to  microprocessors? 

Ramamoorthy 's  results  with  FORTRAN  programs  are  applicable  to  a 
variety  of  uses  on  a  limited  scale  [1,14].  The  source  program  must  be 
less  than  200  executable  statements.  It  is  executed  on  a  large  scale 
CDC  6600  computer,  and  the  parallel  processes  are  scheduled  dynamically 
during  execution.  Only  non-nested  DO  loops  are  allowed.  Little 
apparent  interest  or  exploitation  of  these  techniques  was  shown  between 
1971  and  1978,  5ut  in  the  past  18  months  there  has  been  increased 
commercial  interest  in  multi-microprocessor  systems  and  concurrent 
processing  [2-9].  However,  multi-microprocessor  software  and  systems 
have  not  developed  along  the  guidelines,  suggested  by  Ramamoorthy,  for 
large  computers. 

The  currently  developing  distributed  multi-microprocessor  systems 
(master/slaves)  do  not  fully  utilize  the  master  processor  and  are 
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usually  uniquely  specialized  systems  that  are  not  generally  applicable 
to  a  variety  of  uses.  Tney  are,  therefore,  sometimes  not  as  cost 
effective  as  possible.  This  is  parallelism  at  the  operating  system 
level  rather  than  at  the  application  program  level;  multiprogramming  vs. 
parallel  processing. 

There  is  an  untapped  potential  for  a  more  generalized  system  that 
makes  parallel  processing  almost  transparent  to  the  user  or 
microprocessor  system  designer.  The  following  chapters  will  discuss: 

1.  Factors  bearing  on  the  problem; 

2.  Partial  solution  -  suitability  checker; 

3.  Interpretation  of  results; 

4.  Suggestions  for  further  work. 

It  is  assumed  that  readers  have  a  rudimentary  knowledge  of  common  terms 
used  to  describe  an  assembler  program. 


II.  CONSIDERATIONS  AND  ASSUMPTIONS 


The  problem  of  implementing  an  assembler  scheduled  multi- 
microprocessor  system  may  be  broken  into  five  parts  [1,4].  First, 
appropriate  candidate  programs  must  be  found  by  using  a  suitability 
checker.  Secondly,  parallel  processable  portions  of  the  program  must  be 
identified  using  a  parallel  task  recognizer.  Thirdly,  synchronization 
primitives  must  be  added  for  interprocessor  communications  and 
scheduling.  Fourthly,  utilizing  the  parallel  task  recognizer  and 
scheduling  information,  the  program  must  be  loaded  into  memory  for 
parallel  processing.  Lastly,  the  memory  organization  and  system 
hardware  must  be  defined.  These  are  significant  problems  for  assembly 
language  source  programs  because  of  the  simplicity  and  fundamental 
nature  of  microprocessor  based  systems.  Therefore,  an  attempt  was  not 
made  to  solve  the  entire  problem.  This  thesis  deals  mainly  with  part 
one  and  portions  of  part  two  of  the  problem.  This  includes,  for 
assembly  language  source  programs,  making  a  suitability  determination 
and  finding  elements  of  Ramamoorthy ' s  reduced  or  loop  free  program  graph 
(LFPG)  [13,14].  The  LFPG  has  a  node  for  each  instruction  in  the  source 
program  except  for  the  case  of  loops.  For  loops,  all  instructions  or 
tasks  are  grouped  together  and  represented  as  a  single  node;  thus  the 
term,  loop  free.  Fig.  1  illustrates  examples  of  the  graphs  used  by 
Ramamoorthy ’  s  Parallel  Task  Recognizer.  It  should  be  emphasized  that 
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Analysis  of  Task  Transitions^ 


Original  Program  to  Add 
32  Bit  Numbers^ 


Task 

Source 

Code 

Function  of 

Action 

1 

.ADD  32 

LXI 

B,  3 

immed.  operand 

r  B  =  3 

2 

DAD 

B 

task  1 ,  r  H&L 

r  H&L  =  r  H&L+r  B&C 

3 

XCHG 

task  2,  r  D&E 

exchg.r  D&E,  r  H&L 

4 

DAD 

B 

tasks  1  &  3 

r  H&L  =  r  H&L+r  B&C 

5 

STC 

nothing 

set  carry 

6 

CMC 

task  5 

reset  carry 

7 

•  LOOP 

ID  AX 

D 

task  3 

load  addend  1 

8 

ADC 

M 

tasks  4,  6  &  ? 

add  acc.+r  H&L,  carry 

9 

STAX 

D 

task  8 

store  result 

10 

DCX 

D 

task  3 

r  D&E  =  r  D&E  -  1 

11 

DCX 

H 

task  4 

r  H&L  =  r  H&L  -  1 

12 

DCR 

C 

r  C 

decrement  loop  index 

13 

JP 

.LOOP 

task  12 

loop  if  positive 

14 

RETT 

task  13 

return 

Figure  1 

Graphs  Used  by  the  Parallel  Task  Recognizer 

^•This  subroutine  requires  that  register  pair  H&L  point  to  the  first 
byte  of  the  first  number.  Register  pair  D&E  points  to  the  first  byte  of 
the  second  number.  Register  G  is  set  to  two. 

^Transitions  exist  only  between  tasks  which  change  a  value  and  the 
next  task  which  uses  that  same  value.  It  is  not  really  necessary  to 
analyze  task  transitions  within  the  loop,  but  this  is  done  for  clarity 
and  completeness.  As  is  shown  in  the  Parallel  Processable  Task  Graph  on 
the  following  page,  a  transition  exists  from  task  1  to  task  2,  because 
task  2  uses  the  results  of  task  1 ,  etc . 


J 


Permissible  Transition 
or  Program  Graph3 


Loop  Free  or 
Reduced  Program  Graph11 


Parallel  Processable 
_ Task  Graph-5 _ 


Program  Task  Equivalent  Task  Task  Partition 


Figure  1  (continued) 

Graphs  Used  by  the  Parallel  Task  Recognizer 


''Each  node  represents  a  statement  in  the  program  shown  on  the 
previous  page.  Unless  an  element  is  a  branch,  its  successor  is  the  next 
node . 

Program  representation  in  which  all  elements  of  a  loop  are 
considered  as  a  single  task  or  a  node  on  the  graph. 


^Time  ordering  exists  between  nodes  connected  by  arrows.  Partitions 
are  identified  by  using  Ramamoorthy's  Matrix  Method  of  Precedence 
Partitions  C141 .  Partitions  would  be  executed  by  two  processors.  Tasks 
in  the  same  partition  are  executed  concurrently.  The  numbers  to  be  added 
and  the  result  would  be  stored  in  common  memory.  The  subroutine  could 
run  up  to  19  percent  faster,  i.e.,  16  tasks  vs.  21  tasks  for  the  complete 
process.  Exactly  how  much  faster  depends  on  the  overhead  of 
initialization  and  transfer  of  information  between  the  two  processors. 
If  the  routine  were  being  called  from  a  loop,  the  benefits  would 
increase  with  each  iteration  of  the  loop. 
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Ramamoorthy ' s  work  was  based  on  high  level  language  source  programs. 
Here,  the  source  programs  are  in  microprocessor  assembly  language  which 
significantly  complicates  the  overall  task.  This  chapter  first 
discusses  the  problems  of  implementing  a  suitability  checker  by  checking 
for  favorable  characteristics  in  pass  one  of  the  assembler.  Second,  the 
differences  in  working  with  FORTRAN  and  assembly  language  are  addressed, 
because  Ramamoorthy 's  work  dealt  only  with  FORTRAN  programs. 


Bernstein  stated  the  conditions  necessary  for  parallel  processing 
and  why  program  suitability  is  programmer  dependent  when  dealing  with 
implicit  (vs.  explicit)  parallelism  [12].  The  same  task  or  algorithm 
could  be  coded  more  or  less  favorably  depending  on  the  programmer's 
style  or  sequence.  Based  partially  on  Bernstein's  work,  Ramamoorthy 
devised  a  hueristic  formula  for  determining  suitability  for  parallel 
processing  [ 1 ] . 

The  formula  has  nine  variables  used  with  FORTRAN  in  calculating  a 
suitability  factor,  SF: 


where:  MR  =  READS  or  input 

NA  =  Arithmetic  statements 
NP  =  PRINTS  or  output 
NC  =  CALLS 

ND  s  DO  loops  (loops  of  known  boundaries  and 
iterations) 

MI  =  IF's  or  conditional  branches 
NG  =  GO  TO's  or  unconditional  branches 
LP  =  Total  executable  statements 
LD  =  Total  statements  in  DO  loops 
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This  formula  was  based  on  research  gathered  after  using  Rsmamoorthy ' s 
parallel  task  recognizer  to  analyze  FORTRAN  programs.  It  shows  as  plus 
factors,  those  tasks  that  could  be  done  simultaneously  given  that 
Bernstein's  conditions  were  satisfied.  The  negative  factors  represent 
conditions  that  delay  scheduling  decisions  until  execution  time,  i.e. 
conditional  branches  and  unconditional  forward  branches.  These  state¬ 
ments  can  generate  intricate  paths  which  complicate  prediction  of 
process  flow  [12].  The  factors  LP  and  LD  reflect  the  fact  that  DO  loops 
enhance  possibility  for  parallel  processing  provided  the  loops  are  not 
too  long  with  respect  to  other  partitions  (tasks  that  can  be  executed  as 
a  block).  Obviously,  if  the  loops  are  very  long,  other  partitions  would 
be  executed  before  the  loop  finished.  One  processor  would  have  to  wait 
so  long  for  the  loop  to  finish,  that  benefits  of  parallel  processing 
would  be  lost.  It  would  be  better  in  such  a  case  to  process 
sequentially  and  not  incur  the  overhead  of  establishing  parallel 
processes,  because  the  overhead  might  cancel  any  benefits  of  parallel 
execution. 

If  this  works  for  FORTRAN  it  would  seem  to  be  applicable  for  any 
language.  But  it  is  significant  to  note  that  although  the  algorithm  for 
a  suitability  checker  or  parallel  task  recognizer  is  language 
independent,  its  implementation  is  obviously  dependent  upon  the  level 
and  structure  of  the  source  language  to  a  great  extent.  Thus  a 
completely  universal  application  is  not  possible  [11,14].  If  further 
work  is  necessary  to  implement  the  recognizer  on  another  language,  why 
choose  assembly  language? 


i 


9 


Advantages  of  Assembly  Language 
Suitability  Checker 

There  are  many  significant  reasons  to  analyze  assembly  language  for 
parallel  processing.  First,  there  is  a  large  potential  to  improve  many 
existing  assembly  language  programs,  since  they  are  usually  more  memory 
efficient  than  high  level  language  programs  [33.  Second,  this  research 
could  indicate  the  need  for  high  level  languages  such  as  PASCAL  that 
allow  explicit  indications  of  parallelism  to  be  used  on  microprocessors 
[8,11].  Third,  we  can  identify  and  standardize  the  most  effective 
parallel  constructs  as  desirable  explicit  capabilities  of  high  level 
microprocessor  languages  [93.  Fourth,  utilizing  implicit  parallelisms 
in  existing  assembly  language  does  not  require  the  programmer  to  learn  a 
new  language.  Fifth,  studying  parallel  processing  with  small  computer 
systems  would  be  cost  effective.  And  when  the  results  are  better 
understood,  the  techniques  may  be  applied  to  more  sophisticated  systems 
for  further  benefits.  How  then  can  this  analysis  be  applied  to  assembly 
language? 

Two  main  differences  exist  between  FORTRAN  and  assembly  language 
with  respect  to  implementing  the  suitability  checker  and  parallel  task 
recognizer.  One,  INTEL  8080  language  has  no  explicit  constructs  for 
loops  of  known  iteration  or  length  that  can  be  easily  determined  by 
scanning  a  single  line  of  the  source  program.  But  loops  in  assembly 
language  can  be  compared  to  certain  characteristics  of  FORTRAN-like 
loops,  to  find  the  loops  for  which  the  analysis  applies.  TVo, 
determining  the  task  transitions  may  not  be  as  easy  in  assembly  language 
as  in  FORTRAN  [1,123.  In  the  parallel  task  recognizer  the  parallel 
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processable  task  graph  requires  that  a  determination  be  made  when  the 
output  or  result  of  one  task  is  used  by  or  input  to  another  task.  This 
is  fairly  easy  in  FORTRAN,  because  memory  locations  are  stated 
specifically  in  the  instructions  [12].  It  is  not  so  obvious  in  assembly 
language,  because  many  transitions  depend  on  more  subtle  conditions  such 
as  flags  set  or  interrupts  enabled.  Although  these  can  be  determined  it 
is  not  always  an  easy  matter  of  scanning.  The  recognizer  will  require 
more  sophistication.  But  there  are  other  considerations  besides 
language. 

ViTiy  use  one  time  preload  scheduling  instead  of  dynamic  scheduling 
as  is  done  with  some  high  level  languages?  It  is  because  a 
microprocessor  is  usually  driving  a  dedicated  system,  no  matter  hew 
general  its  structure  may  be.  It  does  not  need  the  flexibility  provided 
by  a  dynamic  scheduler  nor  the  associated  overhead.  Ey  using  one  time 
scheduling,  optimization  cost  is  spread  over  thousands  or  millions  of 
program  executions  [1].  According  to  the  author's  preliminary 
investigation  with  the  NATIONAL  SEMICONDUCTOR  microprocessor  family,  the 
processor  speed  does  not  permit  dynamic  scheduling  at  the  user  program 
level  that  achieves  any  meaningful  benefit  if  it  is  indeed  possible. 
This  conclusion  is  supported  by  others  [8]. 

Dynamic  scheduling  with  one  processor  acting  as  an  interpretation 
or  scheduling  unit  for  other  execution  units  and  transfer  units  has  been 
suggested  by  Lorin  [10].  Because  non-microprogramable  processors  are 
too  slew  to  function  as  the  scheduling  unit,  this  function  was 
relegated  to  the  assembler  for  a  one  time  schedule  at  load  time.  But 
even  this  method  requires  much  analysis  time.  Therefore  it  is  very 
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desirable  to  assure  some  hope  of  success.  This  is  the  reason  for  the 
suitability  checker  [1]. 

The  suitability  checker  is  important,  because  the  parallel  task 
recognizer  requires  large  arrays  with  the  number  of  elements  equal  to 
the  square  of  the  nunber  of  executable  statements  in  the  source  program. 
This  not  only  requires  a  large  amount  of  memory,  but  requires  more 
execution  time  to  manipulate  the  arrays.  The  suitability  checker, 
however,  requires  significantly  smaller  arrays.  The  largest  is  only 
four  times  the  number  of  executable  statements.  Also  there  are  two 
reasons  why  the  parallel  task  recognizer  alone  is  not  sufficient  for 
analyzing  assembly  language.  First,  it  does  not  reveal  anything  about 
loop  structure  that  would  aid  subsequent  checking  for  loop  iterations. 
Secondly,  it  requires  more  array  manipulation  for  assembly  language  to 
find  loops,  because  they  are  not  explicit  as  in  FORTRAN.  But,  all 
information  required  is  available  in  assembler  pass  one,  so  it  can  make 
a  determination  before  using  the  parallel  task  recognizer.  The  results 
could  be  indicated  and  decisions  requested  interactively  or  made  based 
on  a  predetermined  value. 

Assumptions  for  Assembly  Language 
Suitability  Checker 

Two  main  assumptions  were  made  to  implement  the  suitability  checker 
with  respect  to  INTEL  8080  assembly  language.  These  assumptions  involve 
determining  instruction  types  and  recognizing  loops.  The  INTEL  8080  was 
chosen  because  of  experience  with  it,  and  because  it  is  one  of  the  most 
widely  used  microprocessors.  These  assumptions  should  apply  to  most  any 
microprocessor  language  for  recognizing  instruction  types  and 


recognizing  loops.  Recognizing  instructions  is  obviously  machine 
language  dependent.  The  decision  to  use  op-codes  or  mnemonics  to 
recognize  instructions  depends  on  how  op-codes  are  grouped  and  how  well 
mnemonics  relate  to  a  class  of  instructions  based  on  the  variables  in 
Ramamoorthy ' s  suitability  formula.  The  easiest  method  should  be  used 
for  the  machine  in  question.  For  the  INTEL  8080,  checking  op-codes 
works  very  well  to  recognize  instruction  types.  See  Appendix  D.  But  we 
must  also  be  able  to  recognize  loops. 

Assume  for  assembly  language  that  any  backward  jump  is  a  loop. 
Although  this  is  not  the  same  as  a  DO  loop,  assume  that  these  loops  are 
for  a  known  number  of  iterations.  If  necessary,  some  form  of  check  can 
be  done  later  to  determine  which  loops  are  for  indefinite  iterations. 

All  types  of  FORTRAN-like  loops  can  be  described  in  assembly 
language  with  three  basic  constructs  shown  in  Fig.  2.  Tnree  loop 
classes  are  necessary  because  loops  are  classified  structurally  by  their 
entry  and  exit.  For  a  simple  loop  the  conditional  jump  provides  entry 
and  exit.  For  an  intermediate  or  complex  loop  the  unconditional  jump 
back  always  provides  a  possible  entry.  The  conditional  jumps  around  the 
jump  back  provide  a  possible  exit.  If  either  entry  or  exit  is  not  pos¬ 
sible,  there  is  probably  a  logical  error.  Why  is  it  necessary  to  check 
for  structure?  First,  because  overlapped  loops  are  not  allowed  in  this 
analysis.  Secondly,  because  analyzing  structure  makes  some  forms  of 
error  detection  possible  by  determining  if  the  minimum  structure  is  not 
present.  Thirdly,  because  it  will  allow  later  analysis  of  factors 
affecting  iterations  for  scheduling  considerations.  After  analyzing  the 
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L> 


CLASS  1  CLASS  2  CLASS  3 

SIMPLE  INTERMEDIATE  COMPLEX 


CONDITIONAL  JUMP 


UNCONDITIONAL  JUMP 


All  loops  may  be  represented  as  one  of  these  three  classes. 
The  class  determination  factors  are  the  entry  and  exit  to 
the  loop.  Other  jumps  may  be  present  but  are  not  required. 
If  the  minimum  conditions  are  not  satisfied,  there  is  prob¬ 
ably  a  logical  error. 

n 


OVERLAPPED 


Overlapped  loops  are  not  allowed,  because  they  cannot  be 
considered  as  either  a  single  task  or  two  discrete  tasks 
by  the  suitability  checker  or  the  parallel  task  recognizer. 
They  will  be  noted  by  the  suitability  checker.  The  over¬ 
lapped  loops  would  cause  the  results  to  be  invalid,  unless 
the  overlapped  loops  are  nested  inside  a  third  loop.  In 
this  case  they  would  make  no  difference,  because  all  tasks 
included  in  the  outer  loop  would  be  considered  a  single  task. 


Figure  2 

FORTRAN  -  Like  Loops  in  Assembly  Language 
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loops  with  respect  to  structure,  what  is  the  significance  of  other 
jumps? 

There  are  three  other  situations  wherein  jumps  are  related  to  the 
loops:  jumps  out,  jumps  around,  and  internal  jumps.  Jumps  out  of  the 
loop  in  excess  of  minimum  requirements  are  associated  with  the  loop. 
Possible  transitions  are  noted,  but  the  extra  jumps  are  not  significant 
to  the  structural  classification.  Jumps  around  the  loop  are  not 
associated  with  the  loop,  because  they  eliminate  it  as  a  task,  but  the 
fact  is  noted  and  analyzed  for  possible  errors.  Other  internal  jumps  in 
excess  of  minimum  requirements  are  associated  with  the  loop,  because  the 
recognizer  will  treat  the  whole  loop  as  a  single  task.  Also,  each  loop 
could  be  further  analyzed  as  a  subprogram  to  check  possibility  for 
parallel  processing  within  the  loop.  See  Fig.  3. 

Based  on  these  assumptions  it  is  possible  to  define  a  loop,  what 
loop  classes  are  present  as  shown  in  Fig.  2,  and  haw  other  jumps  relate 
to  the  loops.  Basic  rules  are  established  for  checking  type  and  number 
of  instructions,  recognizing  loops,  and  deciding  how  they  affect  the 
subject  program  in  terms  of  its  suitability  for  parallel  processing. 
The  next  chapter  discusses  the  suitability  checker  algorithm. 


Example  of  a  loop  as  a  single 
task  to  find  G . 


G  =  A  x  D 

L 


A  =  B  +  G 


D  =  E  +  F 


Figure  3 

Parallel  Processing  Within  a  Loop 


III.  ALGORITHM  FOR  SUITABILITY  CHECKER 


The  assembler  suitability  checker  is  based  on  Ramamoorthy ' s 
research  discussed  in  the  previous  chapter.  The  suitability  checker 
presented  herein  has  been  adapted  for  use  with  assembly  language.  These 
changes  to  the  cross  assembler  are  shown  in  Fig.  4.  During  assembler 
pass  one  it  scans  the  instruction's  op-code  to  count  different  types  of 
instructions  for  the  suitability  factor,  SF.  If  a  jump  is  found,  it 
checks  to  determine  the  type  of  jump  and  build  a  jump  table.  At  the 
end  of  pass  one,  the  loop  analysis  process  requires  scanning  from  the 
end  of  the  source  program  to  the  beginning  to  see  if  minimal 
requirements  for  loops  are  met,  check  for  possible  errors,  calculate  the 
SF,  and  find  the  nodes  of  the  loop  free  program  graph  (LFPG).  The 
output  is  shown  in  the  short  example  listing  in  Fig.  5. 

The  loop  analysis  is  the  most  lengthy  and  complex  portion  of  the 
algorithm.  The  source  code  is  checked  backwards,  because  it  is  possible 
to  work  back  from  the  jump  and  determine  what  is  included  in  the  loop. 
Multiple  jumps  cannot  start  at  the  same  point,  but  they  can  terminate  at 
the  same  point.  The  objective  is  to  associate  as  many  jumps  as  possible 
with  loops.  Others  not  associated  must  be  isolated  and  are,  therefore, 
negative  factors  in  the  suitability  equation.  This  is  so,  because  every 
statement  contained  in  a  loop  is  treated  as  a  single  task  by 
Ramamoorthy • s  parallel  task  recognizer. 
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INSTRUCTION  TYPES 
FOR  S  F  CALCULATION 


|  run  j  r 


THESE  DO  STATEMENTS 
REFER  TO  THE  PROGRAM 
IN  APPENDIX  3. 


CHECK  LOOP  CONSTRUCTS 
(DO  590)  • 

CHECK  3ACK  JUMPS 
(DO  574) 

CHECK  FORWARD  JUMPS 
_ (D 0  577 ’) _ 


FIND  NESTED  LOOPS 


JUMP  RELATIONSHIPS 
ARE  DETERMINED  3Y 
COMPARING  EXECUTABLE 
INSTRUCTION  NUMBERS . 
OBJECTIVE  IS  TO 
ASSOCIATE  AS  MANY 
JUMPS  AS  POSSIBLE 
WITH  LOOPS. 


PASS  TWO 
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S  F  CALCULATION 
f  OUTPUT  JUMP  ANALYSIS  j* 


INITIALIZE 
FOR 
LFPG 


1 

I 


L 


S  R  FINDLP 


I - 1 

I  SR  LSI  CUT  i 


^  *1  /  LFPG  /  1 

I  l  hrii 


-  EXISTING  PROGRAM  -  ADDED  PROGRAM 

Figure  4 

Modifications  to  the  Assembler  Program 


Figure  5 


This  chapter  explains  how  the  suitability  checker  is  incorporated 
into  the  assembler.  Although  it  is  possible  to  quickly  find  jumps  and 
other  types  of  instructions  to  count  for  the  SF  calculation,  more 
analysis  is  required  to  determine  nested  loops,  jumps  not  associated 
with  loops,  and  number  of  instructions  within  loops.  This  is  the 
purpose  of  the  loop  analysis.  After  finding  the  types  of  instructions, 
all  of  the  loops  must  be  checked  to  determine  their  construction.  This 
is  a  means  of  determining  the  relationships  of  all  jumps  according  to 
the  three  loop  constructs  defined  in  the  previous  chapter.  Each  jump  is 
first  checked  to  see  if  it  is  already  associated.  If  not,  it  is  cheeked 
to  find  if  it  is  forward  or  back.  If  back  (a  loop),  all  forward  jumps 
are  checked  against  it  to  see  if  any  forward  jumps  go  around  the  loop, 
jump  out  of  the  loop,  or  jump  into  the  loop.  This  information  is  used 
to  determine  the  loop  class  or  any  errors.  On  the  other  hand,  if  the 
jump  being  checked  is  forward,  checks  of  all  forward  jumps  and  each 
subsequent  loop  are  made  to  see  if  it  jumped  around  any  loops,  if  it 
jumped  around  any  other  forward  jumps,  or  if  any  forward  jumps  jumped 
around  it.  In  this  way,  the  relationship  of  all  jumps  is  determined  and 
analyzed  for  errors.  Then  the  identifications  for  nested  loops, 
unassociated  forward  jumps,  and  instructions  within  loops  can  be  made. 

The  remainder  of  this  chapter  discusses  in  detail  the  algorithm 
necessary  to  modify  the  main  assembler  program,  modify  the  source  list, 
and  add  one  subroutine  to  find  loop  numbers.  This  includes  discussion 
of  common  areas  and  variables,  instruction  scan,  loop  structure  check, 
nested  loop  check,  SF  calculation,  jump  analysis,  shewing  the  nodes  of 
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LFPG,  and  subroutine  FINDLP.  See  Appendix  B  for  the  assembler 
modification  instruction  listing. 

Common  Areas  and  Variables 

TVfo  named  commons,  LOOP  and  INST,  were  added  to  the  main  program. 
The  array  sizes  chosen  have  proved  workable,  but  could  be  increased  if 
desired.  Common  LOOP  contains  information  about  loops  found  in  the 

source  program.  It  includes  four  variables.  LPMAX  is  the  total  number 
of  loops.  Array  LP  (100)  is  the  loop  table  which  contains  the  executable 
instruction  number  of  each  loop's  jump  instruction.  LPNO,  loop  number, 

is  the  pointer  to  the  loop  array,  LP  (100).  LODES  is  a  pointer  to  the 

loop  table  entry  whose  destination  is  currently  the  lowest  during  the 
nest  check.  The  other  common,  INST,  contains  information  about 
instructions.  It  has  five  variables.  The  array  JPX  (1000,2),  for 

jump/execution  number,  holds  the  jump  number  and  executable  instruction 
number  for  each  line  of  source  code.  The  array  JP  (200, 4),  for  jump 
table,  holds  four  data  for  each  jump.  That  is:  1)  the  executable 

instruction  number  of  the  jump,  2)  the  executable  instruction  number  of 
the  destination,  3)  the  type  of  jump  or  class  of  loop,  and  *0  the 
pointer  to  the  associated  loop.  MSTOPN,  for  stop  number,  and  "ST0PD, 
for  stop  dot,  are  used  in  printing  the  source  list  to  show  the 
destination  and  jump  for  each  outside  loop.  LFPG  is  the  node  number  of 
the  loop  free  program  graph.  The  array  JPDES  (200),  for  jump 


destination,  is  used  only  by  the  main  program  as  a  table  of  jimp 
destinations.  It  is  not  in  a  common.  It  holds  the  symbol  table  pointer 
for  each  junp's  destination.  This  is  required,  because  the  executable 
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instruction  number  of  the  forward  jump's  destination  is  not  known  until 
the  end  of  pass  one.  For  a  list  of  all  variables,  see  Fig.  6. 

Counters,  and  Qth.gr  Variables 

The  suitability  factor,  SF,  requires  seven  counters: 

SF  =  ( ( NIQt-NAL+MC+NB-MF)  / ( NX-NL) )  -  (ML/ NX) 

NIO  is  the  number  of  input  or  output  instructions  (equivalent  to  FORTRAN 
reads  and  writes).  NAL  is  the  number  of  arithmetic  and  logical 
instructions.  NC  is  the  number  of  unconditional  calls.  MB  is  the 
number  of  loops  (backward  jumps).  NF  is  the  number  of  forward  jumps  not 
associated  with  a  loop  plus  the  number  of  conditional  calls.  MX  is  the 
total  number  of  executable  instructions.  ML  is  the  total  number  of 
instructions  contained  in  outside  loops.  Note  that  this  includes  all 
instructions  in  loops,  but  instructions  in  nested  loops  are  not  counted 
more  than  once.  By  comparing  these  seven  variables  to  Ramamcorthy ' s 
formula  on  page  7,  the  relationships  may  be  noted.  Six  additional 
variables  are  required  to  modify  the  program. 

NJ  is  the  jump  table  pointer  to  array  JP  (200,4).  NJMAX  is  the 
total  number  of  jumps.  KPTR  is  an  additional  symbol  table  pointer  that 
can  be  used  to  determine  a  label  location  without  disturbing  the 
original  symbol  table  pointer,  SMBPTR.  K,  the  instruction  type,  is  part 
of  the  original  assembler.  It  is  used  with  the  original  ICODE,  the 
instruction  op-code,  to  determine  the  variables  for  SF.  See  Appendix  D 
for  explanation  of  op-code  groupings.  KDEST  is  used  as  the  address  of  a 
jump  destination,  so  a  label  location  can  be  noted  without  disturbing 
the  original  symbol  table  pointer.  The  next  section  explains  how  all 
these  variables  are  used  in  the  analysis. 
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INITIAL 
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LOOP  TABLE 
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CURRENT 
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S. 

ORIGINAL  SYMBOL  TABLE  POINTER 

T.  POINTER 
0 
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TABLE 

PASS  1 

KDEST 
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CURRENT 

PASS  1 

DESTINATION 

Figure  6 


Variable  Initialization 
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Instruction  Scan 

The  original  assembler  program  is  altered  so  that  the  scanner 
(program  label  500)  checks  each  source  line  after  the  assembler  finishes 
and  before  the  next  source  line  is  read.  See  Fig.  7.  The  scanner 
checks  the  op-code,  ICODE,  which  is  already  available  to  determine  the 
instruction  type  for  SF  calculation.  For  the  INTEL  80  80  language  it.  was 
found  convenient  to  check  op-codes  because  of  the  way  they  are  grouped. 
See  Appendix  D.  The  type  of  instruction  can  be  determined  by  checking 
within  a  range  of  op-codes.  If  it  is  a  jump,  note  jump  type  as  follows: 

1.  Type  1  is  conditional  backward; 

2.  Type  2  is  unconditional  backward; 

3.  Type  6  is  conditional  forward; 

4.  Type  8  is  unconditional  forward. 

Forward  jumps  are  distinguished  from  backward  jumps  by  the  fact  that 
only  backward  jumps  have  a  destination  defined  prior  to  their  encounter 
by  the  scanner.  This  is  determined  by  checking  the  symbol  table.  If 
the  destination  is  undefined,  either  the  jump  is  forward  or  there  is  an 
error.  Based  on  this,  construct  the  initial  jump  table  in  array  JP. 
This  includes  the  jimp's  execution  nimber,  destination  if  backward  or 
KDEST  if  forward,  and  jimp  type.  Backward  jumps  only  (loops)  are 
associated  with  themselves.  Continue  by  counting  executable 

instructions,  but  comments  and  pseudo  op's  are  not  counted.  Then  return 
to  the  assembler  to  read  the  next  source  line  at  program  label  1 .  At 
the  end  of  pass  one,  signified  by  the  END  pseudo  op,  find  the  executable 
instruction  number  of  the  destination  for  all  forward  jumps.  See  Fig. 


FOUR  ENTRY  POINTS  TO  INSTRUCTION 
CHECK  FROM  THE  ASSEMBLER,  WHERE 

IT  FOUND  EXECUTABLE  INSTRUCTIONS. 


Figure  7 

Flowchart  for  Instruction  Scanner 


8.  Prior  to  pass  two,  check  loop  constructs,  find  nested  loops,  and  do 
the  jump  analysis. 


Checking  Loop  Constructs 

This  discussion  is  supplemented  with  flowcharts  throughout  the 
following  pages.  Also  the  reader  may  wish  to  refer  to  Appendix  S.  All 
program  labels  and  format  statements  refer  to  the  program  listing  of 
assembler  modifications  in  Appendix  B. 

Loop  constructs  must  be  checked  to  classify  loops  and  find  any 
structural  errors  as  well  as  to  associate  all  possible  jumps.  This 
process  is  called  DO  590  in  the  program.  See  Fig.  9.  Tne  index  is  M C. 
Note  that  since  it  is  desirable  to  check  backwards,  the  index  is 
manipulated  to  do  this  and  the  reverse  index  is  MJ.  First,  in  the  jump 
table,  check  if  the  jump  is  already  associated.  If  it  is,  go  to  590, 
because  that  means  it  has  already  been  checked.  If  not,  branch 
depending  on  whether  the  jump  is  forward  or  back. 

If  the  jump  checked  by  DO  590  is  backward  signifying  a  loop,  go  to 
label  562,  the  start  of  DO  loop  574.  This  is  shown  in  Fig.  9.  .All 
forward  jumps  are  checked  against  the  loop  for  possible  association  with 
it.  To  be  associated,  a  jump  must  start  in  the  loop.  If  it  does  start 
in  the  loop,  there  are  two  possibilities.  If  the  jump  stays  in  the 
loop,  it  is  not  significant  to  the  structural  classification  unless  it 
jumps  around  another  unconditional  jump  out.  If  a  jump  does  exit  the 
loop,  it  is  associated,  and  the  loop  is  checked  for  entry  and  exit. 
This  is  done  by  checking  all  previous  jumps  out  to  make  sure  one  is  not 
unconditional.  This  could  preclude  entry  to  the  loop.  If  such  an 
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Figure  8 

Flowchart  for  Finding  Forward  Jump  Destinations 
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Flowchart  for  Checking  Loop  Constructs 


Flowchart  for  Checking  Loop  Constructs 
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unconditional  jump  out  is  found,  a  further  search  is  made  for  a 
conditional  forward  jump  around  it.  This  complex  construct  identifies  a 
Class  3  loop  as  shown  on  the  second  page  of  Fig.  9  and  in  Fig.  2. 

Each  exit  is  also  checked  to  see  if  it  is  already  associated.  If 
it  is,  this  implies  the  presence  of  nested  or  overlapped  loops,  because 
one  jump  exits  at  least  two  loops  from  the  same  point.  Multiple 
associations  are  noted,  because  the  jump  table  permits  only  one 
association  to  be  stored  for  each  jump.  The  loop  class  is  also  stored 
in  the  jump  table.  Entry  and  exit  errors,  if  any,  are  noted  before 
checking  the  next  jump  in  DO  loop  590.  This  process  of  checking  loops 
is  shewn  in  the  first  three  pages  of  Fig.  9. 

The  last  two  pages  of  Fig.  9  show  the  process  for  checking  forward 
jumps  identified  by  DO  loop  590.  For  this  case,  all  subsequent  loops 
and  previous  forward  jumps  must  be  investigated  to  determine 
relationships  with  the  forward  jump.  If  the  forward  jump  gees  around  a 
loop,  a  negative  association  is  made  with  the  loop  to  show  this  ir.  the 
output.  Such  a  forward  jump  is  still  counted  as  unassociated  ir.  the  SF, 
if  it  is  not  part  of  any  other  loop.  If  the  forward  jump  gees  around 
another  jump,  there  are  two  possibilities.  If  the  jump  around  is 
conditional,  it  is  simply  noted.  But  if  it  is  unconditional,  it  is 
noted  as  a  possible  error,  because  it  could  preclude  the  other  jump  from 
being  executed.  Tnis  is  a  useless  situation  and  a  logical  error.  If, 
at  the  end  of  DO  loop  577,  the  forward  jump  is  not  associated  at  all,  it 
is  associated  with  itself  to  show  that  it  is  an  isolated  forward  jump. 
This  is  the  end  of  DO  loop  590,  the  loop  construct  check.  Using  the 
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Figure  9  (continued) 

Flowchart  for  Checking  Loop  Constructs 


information  gained  here,  it  is  now  possible  to  make  a  positive  check  for 
nested  loops. 


Nest  Check 

The  purpose  of  the  nest  check  is  to  find  nested  loops  and 
overlapped  loops.  See  Fig.  10.  This  is  done  by  using  the  loop  table  as 
a  pointer  to  check  through  the  jump  table  backwards.  Each  loop  is 
checked  against  the  previous  loop  and  the  outermost  loop.  The  objective 
is  to  see  if  the  range  of  executable  instruction  numbers  for  the 
previous  loop  is  a  subset  (nested)  of  the  others  or  an  intersecting  set 
(overlapped)  of  the  others.  This  checking  process  is  done  in  a  loop 
called  DO  588  as  shown  in  Fig.  11.  KC  is  the  loop  index  which  points  to 
the  loop  currently  being  checked.  LODES  points  to  the  loop  having  the 
lowest  numerical  destination  of  those  already  checked  (the  outer  loop). 
To  start  the  check,  the  last  loop  is  designated  the  outer  one,  because 
there  can  be  no  Subsequent  loop  to  contain  it.  During  the  check  there 
are  four  possible  situations.  The  previous  loop  may  not  be  nested.  The 
previous  loop  may  be  overlapped  with  the  current  loop.  Tne  previous 
loop  may  be  nested  in  the  current  loop.  Or,  the  previous  Iccp  may  be 
nested  in  the  outer  loop,  LODES.  For  each  of  these  situations,  a 
message  is  shown  except  for  the  case  when  a  loop  is  not  nested.  If  the 
nests  were  checked  completely  to  the  inner  loop  for  each  iteration  of  DC 
loop  588,  the  process  could  become  very  complex.  To  avoid  this,  the 
current  loop,  KC,  is  checked  only  against  the  previous  loop  and  the 
outermost  one.  This  requires  checking  each  current  loop  to  see  if  it  is 
the  new  LODES,  but  this  is  much  simpler  than  trying  to  track  a  nest  down 
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Figure  11  (continued) 


Flowchart  for  Nest  Check 
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to  its  innermost  loop  in  a  single  iteration  of  DO  loop  538.  Outside 
loops,  regardless  of  whether  they  contain  nested  loops,  are  flagged  for 
later  use  with  the  LFPG.  This  concludes  the  discussion  of  the  nest 
check. 

Final  Checks 

Three  final  checks  are  necessary  to  calculate  SF.  Check  the  final 
jump  table  first  for  forward  jumps  with  no  positive  association. 
Increment  NF  for  each  of  these  isolated  forward  jumps.  Next  find  NL, 
the  total  number  of  instructions  in  locps,  by  subtracting  the  executable 
instruction  number  of  the  destination  from  that  of  the  jump.  This 
result  for  each  outside  loop  is  added  to  find  NL.  This  is  not  done  for 
nested  loops.  Otherwise,  NL  could  be  larger  than  NX,  the  total  number 
of  instructions.  This  would  cause  SF  to  be  negative.  Also,  N6  must  be 
discounted  for  each  nested  loop  to  show  the  number  of  outside  loops 
only.  Prior  to  the  SF  calculation,  check  if  NX  =  NL.  In  this  case, 
show  the  unsuitability  for  parallel  processing  (format  9010). 
Otherwise,  SF  is  calculated  and  output  with  the  jump  table.  These  three 
checks  are  shown  in  Fig.  12. 


The  jump  analysis  checks  only  forward  jumps  to  find  these 
unconditional  ones  which  circumvent  other  instructions.  This  is  done  by 
trying  to  find  another  forward  jump  to  the  next  instruction.  If  such  a 
path  cannot  be  found,  a  search  is  made  for  a  loop  back  to  that  next 
instruction  after  the  unconditional  forward  jump.  If  no  path  is  found, 
a  possible  error  message  (format  9493)  is  shown.  Note  that  in  programs 
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Figure  12 

Flowchart  for  Final  Checks  and  Jump  Analysis 


Flowchart  for  Final  Checks  and  Jump  Analysis 


which  have  vectoring  (jump  to  an  address  that  jumps  to  another  address 
or  has  its  own  return  mechanism) ,  this  will  cause  erroneous  error 
messages,  because  the  return  is  obscure.  At  label  5?6L  note  jumps  not 


associated  with  any  loop.  At  label  597,  the  end  of  the  jump  analysis, 
write  the  headers  for  listing  the  loops.  Then  find  and  list  each  locp 
with  its  associated  forward  jumps,  if  any.  If  there  are  none,  it  will 
so  indicate.  This  is  the  end  of  the  jump  analysis  as  shown  in  Fig.  12 
on  the  previous  page. 


Loop  Free  Program  Graph 

The  key  to  the  Loop  Free  Program  Graph  (LFPG)  is  the  locp  table 
which  has  the  outer  loops  tagged  as  negative  by  the  nest  check.  This  is 
used  in  the  present  version  to  print  a  notation  on  the  source  list 
showing  the  node  numbers  of  the  LFPG  and  non  nodes  as  dots  in  a  column 
between  the  jump  number  and  the  executable  instruction  number.  This  was 
shown  in  Fig.  5.  This  is  done  by  subroutine  LSTCUT  at  the  same  time  it 
is  making  the  source  list.  This  capability  is  initialized  by  the  main 
assembler  program  by  setting  MSTOPN  to  the  number  of  the  last  loop  in 
the  jump  table.  MSTOPN  is  the  flag  that  suppresses  the  LFPG  rode 
numbers  within  outside  loops  on  the  listing.  When  subroutine  LSTCUT  is 
called,  the  assembler  pass  two  has  been  modified  to  determine  if  there 
are  any  loops.  If  there  are  none,  it  performs  normally.  If  there  are 
loops  it  sets  a  flag,  MSTOPN.  This  will  stop  printing  numbers  and  start 
printing  dots  at  the  beginning  (destination  label)  of  the  outer  locp. 
It  also  sets  MSTOPD  to  stop  printing  dots  at  the  end  (jump)  of  that 
loop.  See  Fig.  13  and  Appendix  B. 


EXISTING  SUBROUTINE  LSTOUT  WRITES  OBJECT  CODE  FOR  LOADER  AND  SOURCE  CODE 
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Flowchart  for  Subroutine  LSTOUT  Modification 
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Flowchart  for  Subroutine  L3T0UT  Modification 
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The  subroutine  LSTCUT  has  been  modified  by  adding  the  COMMONS  INST 
and  LOOP.  A  variable  LPX  is  initialized  to  point  at  the  last  loop, 
because  it  limits  the  loop  index  used  to  find  the  next  outside  loop.  It 
checks  while  making  the  source  list  for  comment  lines  or  pseudo  op's. 
Nothing  extra  is  printed  on  these  lines,  because  they  have  no 
significance  for  the  LFPG.  The  node  number  is  printed  on  the  line  if 
the  executable  instruction  number  is  less  than  MSTOPM.  If  not  and  the 
line  is  part  of  a  loop,  a  dot  is  printed.  Use  it  is  the  backward  jump, 
for  an  outer  loop  so  the  node  number  will  be  printed,  and  the  flags  are 
reset.  Tnis  continues  until  the  last  executable  instruction  which  must 
always  be  a  node  by  definition. 

The  subroutine  FINDLP  has  been  added  to  find  loop  numbers  frcm  the 
loop  table  when  given  a  jump  number  passed  in  the  call.  See  Fig.  14  2nd 
Appendix  B.  This  is  necessary  to  provide  comprehensive  diagnostics  that 
reference  both  the  jump  and  the  loop  numbers. 

Tnis  completes  the  discussion  of  the  algorithms  implementing  the 
suitability  checker,  finding  nested  loops,  and  printing  the  LFPG.  7r.e 
next  chapter  discusses  the  interpretation  of  the  suitability 
determination  ar.d  the  diagnostics. 


Figure  14- 


Flowchart  for  Subroutine  FINDLP 


This  chapter  discusses  interpretation  of  the  suitability  factor, 
SF,  and  the  error  messages,  warnings,  and  notes  which  are  a  useful 
by-product  of  the  analysis. 

Suitability  Factor 

Although  SF  is  partly  an  empirical  factor,  it  can  be  theortically 
justified  as  discussed  by  Ramamoorthy  [1].  He  discussed  the  value  of  SF 
=0.3  as  being  a  useful  cutoff  point.  According  to  his  observation  and 
reasoning,  a  value  greater  than  0.3  indicates  that  a  program  has 
characteristics  favorable  for  parallel  processability.  Therefore,  the 
same  value  has  been  used  as  noted  in  the  output.  Further  research  may 
indicate  a  different  value  for  the  assembly  language  suitability  checker 
because  of  two  differences  between  this  application  and  Ramamoorthy ' s. 
These  differences  concern  loops. 

Since  Ramamoorthy  was  interested  in  dynamic  scheduling,  he  included 
backward  jumps  of  unknown  iteration  as  a  negative  contributing  factor. 
With  assembly  language,  the  predetermination  of  loop  iterations  cannot 
be  made  at  this  stage  of  analysis.  Therefore,  the  assumption  was  made 
that  all  the  loops  are  designed  for  a  predetermined  number  of 
iterations.  For  one  time  scheduling  of  the  program  in  the 


multi-microprocessor  system,  each  partition  will  be  scheduled  to  run  on 
a  processor  until  it  is  finished.  For  this  reason,  the  schedule  will  be 
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based  on  earliest  and  latest  possible  task  initiation  times  only;  not 
dynamically  based  on  how  long  the  partition  will  take  to  execute.  Cf 
course,  the  execution  time  must  be  reasonably  balanced  with  other 
partitions,  but  this  is  the  indication  given  by  the  suitability  factor. 
This  is  reasonable  for  nondynamic  scheduling,  because  the  final  solution 
will  eventually  require  some  balancing  or  fine  tuning.  Tnis  would 
involve  consideration  of  loops  of  indefinite  iterations  such  as  loop 
until  interrupt,  cased  on  the  overall  system,  a  decision  would  be  made 
whether  to  allocate  such  a  loop  as  a  discrete  task  or  part  of  a  larger 
set  of  tasks.  With  respect  to  nested  loops,  this  is  a  more  complex 
problem. 

Since  Ramamoorthy ' s  analysis  does  not  allow  nested  loops  and  does 
not  recognize  loops  of  unknown  duration,  it  really  does  net  consider 
loop  length  except  for  the  caveat  that  parallel  .paths  must  not  be  too 
long  [11].  This  is  checked  by  referencing  the  number  of  instructions  in 
loops.  Westing  effectively  increases  the  loop  length  as  shown  in  Fig. 
15.  Because  optimal  length  is  relative  to  the  length  of  other 
partitions,  it  is  not  possible  at  this  stage  of  analysis  to  judge  this. 
The  number  of  instructions  in  the  loop  is  only  a  rough  indication.  A 
short  loop  executed  many  times  could  run  longer  than  a  long  loop 
executed  only  a  few  times.  Therefore,  it  is  necessary  to  examine  not 
only  loop  length,  but  also  iterations  for  the  outer  loops  and  any  inner 
loops.  It  is  not  possible  at  this  stage  to  find  what  determines  the 
iteration  of  each  loop  and  how  many  times  it  will  execute  without 
additional  analysis.  Therefore,  the  formula  is  used  as  an  approximate 
value.  One  use  envisioned  for  the  multi-microprocessor  machine  is  to  do 
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experimental  program  executions  to  find  the  best  possible  schedule  or 
processor  allocation.  When  this  is  found,  the  program  would  be 
considered  ready  to  run  on  a  production  basis.  One  further  comment  is 
necessary. 

Although  it  is  not  highly  significant  here  to  study  the  bounds  on 
SF,  it  should  be  noted  that  it  is  usually  positive. .  But  it  cannot  be 
thought  of  as  a  positive  number  between  zero  and  one.  It  approaches 
infinity  as  the  number  of  instructions  in  loops  approaches  the  total 
number  of  executable  instructions.  This  is  an  undesirable  situation 
indicating  loops  that  are  too  long.  Further  research  will  be  helpful  in 
interpreting  this  factor  for  assembly  language  programs.  For  the 
remainder  of  this  chapter  all  FORTRAN  format  statement  labels  referenced 
are  shown  in  Appendix  8.  Tney  were  also  shown  in  Figs.  7  through  14. 


Possible  errors  noted  by  the  analysis  are  conditions  which  may  be 


due  to  faulty  logic  in  the  loop  or  jump  structure  of  the  source  program. 
They  could  cause  problems  in  execution.  They  are  explained  here  in 
order  of  the  format  label  number.  Statement  9001, "HO  EXIT  FROM  BACK 
JUMP,  UNLESS  8Y  RETURN  OR  OVERLAPPING  LOOP,"  means  an  endless  loop. 
Analysis  of  several  programs  showed  that  this  is  not  unusual  for 
microprocessor  systems,  because  they  are  often  designed  to  run  a  program 
over  and  over.  Therefore,  this  is  shown  as  a  possible  error.  Statement 
9902,  "NO  ENTRY  TO  LOOP,  UNLESS  BY  RETURN  FROM  JUMP  OUT,"  means  that  the 
loop  will  not  be  executed.  This  indicates  a  backward  jump  that  appears 
to  have  been  circumvented  by  a  previous  jump.  This  message  may  be 
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generated  erroneously  by  jumps  to  another  part  of  memory  that  has  a 
return  mechanism  that  is  not  discernable  by  the  analysis. 

Statement  9493,  "NO  PATH  TO  INSTRUCTION  EXCEPT  BY  CALL,"  means 
there  is  at  least  one  instruction  which  will  not  be  executed  because  of 
a  previous  unconditional  jump.  This  message  may  be  generated 
erroneously  by  some  programs  that  include  vectoring  or  some  sequence  of 
unconditional  jumps  with  no  apparent  return. 

Statement  9993,  "UNCONDITIONAL  FORWARD  JUMP  AROUND  FORWARD  JUMP," 
is  nearly  the  same  as  9493  and  in  some  cases  confirms  it.  9493  is 
generated  during  the  instruction  scan,  and  9993  is  generated  by  checking 
the  jump  table  to  confirm  that  there  is  no  apparent  path. 

Warninss-.Mada-  by_the.  Analysis 

Warnings  are  for  conditions  that  have  occurred  during  the  analysis 
that  will  cause  the  results  to  be  incorrect.  With  one  exception  these 
should  not  occur  unless  the  source  program  size  limits  have  been 
exceeded.  These  warnings  are  discussed  in  the  order  of  the  format 
statement  numbers. 

Statement  9000,  "LOOP  NUMBER  FOR  BACK  JUMP  NOT  FOUND,"  means  the 
loop  was  not  recognized  and  stored  in  the  loop  table.  This  error  should 
not  occur  unless  accompanied  by  9991  or  9996.  If  it  does  occur  alone, 
it  means  there  is  a  fault  in  the  analysis  program. 

Statement  9991,  "ARRAY  LP  OVERFILLED  MORE  THAN  100  LOOPS."  This  is 
self  explanatory  and  means  the  array  must  be  enlarged  to  accomodate  the 
subject  program. 
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Statement  9996,  "ARRAY  JP  OVERFILLED  MORE  THAI.1  200  JUMPS."  This  is 
self  explanatory  and  means  the  array  must  be  enlarged  to  accomodate  the 
subject  program. 

Statement  9980,  "LOOP  OVERLAPPED.  SF  VALUE  MAY  NOT  EE  MEANINGFUL." 
This  means  that  two  loops  are  overlapped.  A  loop  jumps  out  of  a 
subsequent  loop  which  jumps  back  into  the  former  one  as  was  shown  in 
Fig.  10.  Unless  these  two  loops  are  both  nested  in  a  third  loop  the 
LFPG  win  be  in  error.  The  analysis  could  not  accomodate  two  nodes 
which  partially  converge  or  are  not  discrete  and  distinct.  If  the 
overlapped  loops  are  nested  in  another,  all  three  will  be  treated  as  one 
task  for  partitioning,  and  the  overlap  will  be  inconsequential.  This 
version  of  the  program  does  not  make  this  determination,  so  all 
overlapped  loops  generate  the  warning. 

Statement  9999,  "UNCLASSIFIED  JUMP,"  means  the  jump  type  or  loop 
class  was  not  established.  It  indicates  an  array  problem  or  fundamental 
error  in  the  suitability  checker.  This  error  should  not  occur. 

Notes  Made  bv  the  Analysis 

Motes  are  for  conditions  discovered  in  the  source  program  that  are 
not  necessarily  wrong  but  considered  essential  to  emphasize.  They  may 
indicate  a  problem,  but  will  be  informative  in  any  case.  They  are  dis¬ 
cussed  in  order  of  the  format  statement  number  as  shown  in  Appendix  B. 

Statement  9^90,  "FWD  JUMP  IS  NOT  ASSOCIATED  WITH  ANY  LOOP,"  means 


an  isolated  forward  jimp.  These  are  detrimental  to  parallel  processing, 
especially  if  conditional. 
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Statement  9903,  "CONDITIONAL  FWD  JUMP  ENTERS  BACK  JUMP  FROM  OUTSIDE 
ITS  RANGE."  Tnis  could  be  an  error,  or  it  may  simply  be  a  way  of 
entering  a  loop. 

Statement  9990,  "JUMP  ALSO  JUMPED  AROUND  JUMP."  This  indicates 
that  a  forward  jump  went  around  another  jump  subsequent  to  the  one  shown 
in  the  jump  table.  It  is  not  necessarily  a  problem.  It  supplements  the 
jump  analysis,  because  the  association  can  only  be  stored  for  one  jump. 

Statement  9994,  "FWD  JUMP  AROUND  BACK  JUMP."  This  shows  a  jump 
around  a  loop.  It  may  indicate  a  problem,  if  there  is  no  other  path  to 
the  loop.  This  is  a  precautionary  note,  because  all  information  was  not 
available  at  the  time  to  make  an  unqualified  error  identification. 

Statement  9995,  "CONDITIONAL  PUD  JUMP  IN  BACK  JUMP,  BUT  IS  NOT 
SIGNIFICANT  TO  ITS  STRUCTURAL  CLASSIFICATION. "  Tnis  indicates  an 
interior  jump  has  been  found  that  is  not  a  minimum  requirement  of  loop 
structure  for  classification  purposes. 

Statement  9997,  "FORWARD  JUMP  ALSO  ASSOCIATED  WITH  JUMP."  Tnis 
indicates  a  jump  out  of  an  inner  loop  of  nested  loops.  According  to  the 
analysis  it  would  be  associated  with  all  loops  it  jumps  cut  of. 
However,  only  one  association  can  be  made  in  the  jump  table. 

Tnis  concludes  discussion  of  the  available  diagnostic  factors, 
errors,  warnings,  and  notes.  An  example  program  listing  showing  some  of 
them  is  included  in  Fig.  17  in  Chapter  V,  which  discusses  actual 
experimental  results. 


V.  EXPERIMENTAL  RESULTS 


This  chapter  shows  actual  results  achieved  by  using  the  suitability 
checker  on  ten  microprocessor  programs.  The  first  section  discusses 
these  results  and  the  second  section  explains  the  output  of  a  sample 
program  run. 

Findings 

Eight  actual  INTEL  8080  programs  and  two  pseudo  programs  (two  test 
cases  written  for  this  research)  were  analyzed  for  parallel  processing 
suitability.  The  results  of  applying  the  suitability  checker  to  these 
programs  are  shown  in  Fig  16.  Probably  the  most  significant  finding  was 
that  most  of  the  programs  were  suitable  except  those  containing 
overlapped  loops  which  were  not  nested  ir.  another  loop.  The  one 
program,  NONDIGIT,  which  was  unsuitable,  because  of  its  long  outside 
loop  structure,  was  found  to  be  highly  suitable  when  the  outer  loop  was 
removed.  In  one  other  case,  TEST  P2M11,  when  the  outer  loop  was 
removed,  this  exposed  overlapped  loops  rendering  the  program  unsuitable 
in  that  context.  Programs  containing  exposed  overlapped  loops  are  shown 
with  an  asterisk. 

Also  the  results  are  shown  in  parenthesis  for  an  earlier  version 
that  neglected  to  discount  nested  loops  from  the  loop  count.  Although 
this  did  make  a  difference,  it  was  almost  negligible  for  the  programs 
tested. 
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It  is  significant  that  the  checker  can  be  used  for  parts  of 
programs.  Although  assembler  errors  will  be  generated,  the  analysis 
will  still  be  completed.  The  next  section  tells  about  the  results  shewn 
on  the  program  listing. 

Explanation  of  a  Sample  Program  Output 

This  discussion  deals  with  the  listing  shown  in  Fig.  17.  The 
entire  figure  is  nine  pages.  The  format  of  this  output  is  the  jump 
analysis  followed  by  the  INTEL  8080  source  program  listing  and  symbol 
table.  The  jump  analysis  was  output  first,  because  it  reveals 
information  about  the  source  list.  The  source  list  must  be  referenced, 
however,  to  use  the  analysis.  This  reference  is  made  through  the  jump 
numbers  which  are  shown  in  the  jump  table  and  on  the  source  list  under 
the  column  label,  JUMP.  See  Fig.  17  (cont.)  on  the  fourth  page.  Both 
jump  numbers  and  loop  numbers  are  given  in  the  analysis. 

The  analysis  format  begins  with  the  errors,  notes,  and  warnings 
generated  by  the  loop  construct  analysis.  This  particular  example  had 
no  warnings,  mainly  because  there  were  no  overlapped  loops.  This  part 
is  followed  by  nesting  information  from  the  nest  check.  Then  the 
suitability  factor  is  shown,  which  is  0.35  in  this  example.  Tne  values 
of  variables  used  to  calculate  SF  are  shown  over  the  top  of  the  jump 
table.  See  Fig.  17. 

The  jump  table  format  is  the  same  as  explained  previously  for  array 
JP.  The  loop  numbers  for  backward  jumps  have  been  added  to  the  left 
side.  They  are  followed  from  left  to  right  by  the  jump  number,  its 
executable  instruction  number  (both  shown  on  the  source  list),  the 
executable  instruction  number  of  the  destination,  the  forward  jump  type 
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Figure  17  (continued) 
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or  loop  class,  and  associated  jump.  For  this  example  jump  1  goes  from 
executable  instruction  number  57  to  64.  It  is  a  type  6  which  is 
conditional  forward  as  confirmed  by  the  source  list  where  it  is  shewn  as 
JZ  ENDER.  This  jump  is  associated  with  jump  3  which  is  also  loop  1  at 
executable  instruction  63.  The  label  ENDER  is  shown  at  executable 
instruction  number  64.  Therefore,  jump  1  jumps  out  of  loop  1.  Also,  as 
the  analysis  shows,  jump  1  is  associated  with  jump  7  (loop  3),  because 
loop  1  is  nested  in  loop  3.  Therefore,  jump  1  is  part  of  both  loops, 
although  the  junp  table  only  shows  the  association  with  the  inner  loop, 
because  of  the  way  the  table  is  constructed.  At  the  bottom  of  the 
table,  notes  and  errors  derived  from  the  final  jump  table  analysis  are 
shown.  In  this  case  the  only  note  was  that  jump  12  is  an  isolated 
forward  jimp  which  is  not  conducive  to  parallel  processing.  Following 
this,  the  analysis  concludes  by  listing  the  loops  and  their  associated 
jumps.  This  is  the  same  information  as  the  jump  table,  but  it  is 
arranged  with  respect  to  loops  rather  than  jumps.  Also  note  that  jumps 
such  as  8  and  9  are  not  shown  as  associated  with  the  loop,  because  they 
are  not  significant  to  the  structural  classification  of  the  loop 
containing  them,  loop  4.  In  other  words,  they  have  no  influence  on 
program  flow  into  or  out  of  the  loop.  Therefore,  although  the  loop  list 
might  be  considered  redundant,  it  is  interpreted  in  a  slightly  different 
manner,  for  convenience  in  determining  loop  structures.  This  was  done 
for  possible  later  use  in  analyzing  loop  iterations.  It  would  be  easier 
to  decide  which  jumps  to  examine,  given  that  some  of  them  are  not 
structurally  significant.  Using  all  analysis  information  assists  in 
examining  the  source  list. 
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The  source  list  closely  resembles  an  ordinary  assembly  language 
listing  except  for  the  addition  of  the  columns  for  JUMP,  LFPG,  and  EX  NO 
in  the  middle  of  the  page.  This  example  listing  in  Fig.  17  was  slightly 
modified  to  fit  the  margins  by  deleting  the  line  numbers  which  would  be 
on  the  far  left  and  bringing  some  comments  from  the  far  right  to  the  far 
left.  Also,  it  should  be  noted  that  the  symbol  table  is  unchanged  from 
the  original.  See  Fig.  17  (cont.). 

In  designing  the  program  output,  consideration  was  given  to  using 
the  program  address  counter  or  line  numbers  for  analysis  reference 
rather  than  executable  instruction  numbers.  Although  the  executable 
instruction  numbers  had  to  be  generated  in  the  program,  they  were  easily 
added  and  are  much  simpler  to  use  and  reference  than  either  line  numbers 
or  program  addresses.  Line  numbers  are  not  definitive  enough,  because 
they  include  comments  and  pseudo  ops  that  have  no  bearing  on  the 
analysis.  Program  addresses  are  harder  to  work  with,  because  they  are 
hexadecimal.  Therefore,  executable  instruction  numbers  were  used  and 
shown  for  the  listing  to  clarify  the  analysis. 

The  final  product  of  the  analysis  is  the  nodes  of  the  LFPG  ’-'hich 
are  shown  on  the  source  list  between  jump  number  and  executable 
instruction  number.  These  numbers  can  be  considered  a  map  of  the  nodes 
of  the  LFPG  of  the  microprocessor  source  program.  The  only  part  missing 
from  the  graph  is  the  edges  or  arrows  between  nodes.  These  can  easily 
be  determined  by  looking  back  into  the  jump  table  where  they  are  shown 
in  the  FROM  and  TO  colunns.  B"  looking  at  the  LFPG  nodes  it  is  easy  to 
see  the  outer  loops  which  are  considered  as  a  single  node  or  task. 
These  loops  are  shown  by  dots  instead  of  numbers  to  indicate  a  series  of 
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Sample  Program  Output 


Figure  17  (continued) 
Sample  Program  Output 
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instructions  in  the  loop  considered  as  one  task.  The  node  denoting  the 
loop  is  the  number  corresponding  to  the  backward  jump  that  forms  the 
loop.  This  information  represented  by  the  LFPG  could  be  used  *s  input 
for  a  slightly  modified  parallel  task  recognizer  as  discussed 
previously. 

For  clarity  the  actual  LFPG  is  shown  in  Fig.  18.  To  be  used  with 
the  parallel  task  recognizer,  this  graph  would  be  used  with  task 
transition  information  to  construct  the  parallel  processable  task  graph 
as  explained  in  Chapter  II.  There  are  only  54  nodes  in  this  graph 
because  nodes  55  to  123  are  actually  parts  of  subroutines  as  shown  on 
the  output  list  in  Fig.  17.  Note  that  loops  four  and  five  are  contained 
in  these  subroutines.  Therefore,  they  occur  more  than  once.  This 
indicates  that  it  would  be  desirable  to  build  a  table  of  calls  and 
returns  to  cross  reference  with  the  jump  table  for  finding  loops  nested- 
due  to  calls.  This  would  also  improve  error  diagnostics  as  mentioned  in 
the  next  chapter. 

Other  specific  observations  relate  to  this  particular  program.  It 
points  out  the  fact  that  the  suitability  checker  is  limited  to 
recognizing  input  and  output  by  the  IN  and  OUT  instructions.  It  is 
obvious  that  this  program  has  a  good  balance  of  I/O  and  internal 
operations.  However,  the  I/O  is  done  by  subroutines  (in  the  read  only 
memory)  which  are  known  to  this  program  only  as  CALL  Cl  or  CALL  CO. 
This  would  appear  to  be  a  positive  factor  for  parallel  processing  in 
this  program  that  is  not  discernable  by  this  suitability  checker.  But 
there  are  negative  factors  that  are  subtle  also.  The  internal 
operations  are  too  closely  associated  with  the  I/O,  because  the  checking 
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NOTE:  All  node  numbers  refer  to  Fig.  1?  LFPG  numbers. 
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Contains  Former  Loop  3  by  call 
if  error  detected. 


FUNCTION 

Load  Stack  Pointer. 

Prompt  user  to  enter  HI  and 
LO  limits  for  memory  check. 
Receive  HI  and  LO  as  four 
bytes . 

Convert  LO  and  HI  to 
machine  language  memory 
addresses . 

Prompt  user  to  enter  NUM1  and 
NUM2  for  memory  check . 

Receive  the  numbers  as  two 
bytes  each. 


Convert  each  number  to  one 
byte . 

Display  header. 

Fill  memory  with  alternating 
NUM1  and  NUM2  from  LO  to  HI, 
check,  exchange  NUM1  and  NUM2, 
and  repeat . 


Figure  18 

Loop  Free  Graph  of  the  Sample  Program 


done  by  the  process  in  node  5 4  used  the  same  variable  names  for  HI  and 
LO  as  the  input  portion.  This  is  an  obvious  conflict  that  could  be 
eliminated  by  using  a  buffer,  if  the  programmer  had  been  thinking  in 
terms  of  parallel  processing. 

Although  it  is  necessary  to  use  the  parallel  task  recognizer  to 
find  the  optimum  partitions,  it  is  obvious  that  the  program  could  be 
simply  partitioned  between  nodes  53  and  54.  That  is,  one  processor 
could  do  the  I/O  and  preparation  while  another  checked  the  memory. 
There  is  a  possible  conflict  between  the  I/O  and  the  ERR  routine  which 
uses  the  MESS  and  CRLF  routines  as  shown  in  Fig.  17.  This  would  happen 
if  an  error  were  discovered,  which  would  direct  a  diagnostic  message. 
This  could  be  overcome  by  letting  the  memory  checking  processor 
interrupt  the  other  processor  to  perform  the  diagnostic  message  as 
discussed  in  Appendix  A. 

Another  point  to  make  is,  that  for  parallel  processing,  it  might 
have  been  better  for  the  initial  prompt  to  have  requested  all  necessary 
information  which  could  have  then  been  processed  in  parallel.  This 
clearly  shows,  that  the  programmer  cannot  be  disregarded.  Programs 
written  for  uniprocessors  will  necessarily  be  limited  in  parallelisms  by 
their  structure.  But  improvements  can  be  made  by  processing  portions  of 
these  programs  in  parallel.  However,  one  of  the  most  needed 
improvements  is  to  emphasize  the  need  to  program  for  parallelism  rather 
than  sequential  processing.  The  next  chapter  discusses  conclusions  and 


recommendations 


VI.  CONCLUSIONS  AND  RECOMMENDATIONS 

As  the  concluding  chapter,  this  discussion  will  emphasize  the 
significance  of  the  work,  general  findings,  suggested  complementary 
work,  and  use  of  the  multi-microprocessor  system. 

The  objective  of  this  thesis  was  to  incorporate  suitability 
checking  into  a  cross  assembler  as  a  step  toward  the  goal  of  assembler 
scheduled  parallel  processable  program  partitions  for  a  multi¬ 
microprocessor  system.  This  objective  was  achieved  by  solving  the  major 
problem  of  detecting  loops  in  the  microprocessor  assembly  language 
source  program  and  adapting  Ramamoorthy ' s  suitability  checker  for  use 
with  assembly  language.  The  additional  diagnostics  for  structural 
analysis  of  the  source  program  were  an  additional  benefit  incidental  to 
the  problem  solution. 

Significance  of  This  Work 

The  research  reprorted  within  has  important  implications  for 
further  research  and  for  debugging  existing  or  developing  programs.  The 
suitability  factor,  SF,  enables  an  easy  meaningful  estimate  of  potential 
for  parallel  processability  of  the  subject  assembly  language  source 
progran.  This  gives  ready  indication  of  which  programs  should  be 
subjected  to  further  refinement  such  that  they  may  be  efficiently 
executed  on  a  multi-microprocessor  system.  It  enables  finding  the  nodes 
of  Ramamoorthy ' s  reduced  progran  graph  or  loop  free  program  graph  (LFPG) 


70 


[13].  This  is  a  task  model  of  the  subject  program  where  all  outside 
loops  are  considered  as  a  single  task.  The  actual  graph  can  be 
constructed  from  information  available  on  the  output  listing  of  the 
analysis. 

The  analysis  of  an  assembly  language  source  program  does  not 
require  any  of  the  large  square  matrices  used  by  Ramamoorthy ' s  parallel 
task  recognizer  before  having  some  assurance  that  there  is  indeed 
potential  for  parallelism.  The  arrays  that  it  does  use  are  not  large 
relative  to  a  medium  sized  computer  system.  See  Appendix  C.  They  could 
be  compressed  by  superimposing  some  arrays  on  others,  but  the  saving  to 
be  realized  is  not  deemed  worthwhile  compared  to  the  increased 
complexity  that  would  be  required.  The  presented  version  actually 
combines  the  capability  of  a  suitability  checker  and  Phase  I  of 
Ramamoorthy' s  parallel  task  recognizer  [14].  The  information  from  the 
LFPG  could  be  combined  with  applicable  task  transition  information  and 
subjected  to  Ramamoorthy ' s  parallel  task  recognizer  to  obtain  the  task 
partitions  necessary  for  parallel  processing. 

Ramamoorthy ' s  suitability  checker  program  differs  from  this  one 
with  respect  to  loops  in  two  ways.  His  progran  did  not  allow  nested 
loops  and  did  not  recognize  loops  created  by  backward  jumps  or  branches. 
It  dealt  only  with  DO  loops  [11].  This  program  recognizes  three  classes 
of  FORTRAN-like  loops  in  the  assembly  language  program  as  well  as  nested 
loops.  It  finds  and  notes  overlapped  loops  which  may  be  detrimental  to 
parallel  processing.  The  ability  to  recognize  loops  will  aid  further 
research  in  the  area  of  determining  program  segments  which  may  be 
executed  in  parallel  on  a  multi-microprocessor  system. 


Diagnostic  messages  output  with  the  jump  analysis  and  source  list 
have  never  been  available  before  for  microprocessor  assembly  language 
programs  to  find  endless  loops,  find  loops  with  no  entry,  find  nesting 
errors,  and  analyze  program  flow  or  structure.  Although  the  diagnostics 
have  some  conditions  associated  with  them,  they  will  prove  generally 
useful  for  debugging  or  analyzing  programs.  Programs  that  use 
"vectoring"  will  generate  erroneous  error  messages,  because  this 
analysis  does  not  trace  the  vector.  It  would  be  possible  to  improve  the 
diagnostic  capability  by  building  a  table  of  calls  and  returns  for 
tracing  flow  vectors  and  giving  more  definitive  error  messages.  Error 
messages  for  loops  with  no  exit  must  be  regarded  with  program  design  in 
mind,  because  many  microprocessor  program  designs  purposely  include 
endless  loops  due  to  their  dedicated  nature.  These  diagnostics  actually 
tell  more  about  hew  the  program  is  structured  than  whether  it  is 
logically  correct.  But  they  do  provide  objective  automated  analysis, 
and  will  point  out  some  problems.  Using  the  analysis  has  already 
revealed  some  general  conclusions. 

General  Conclusions 

The  suitability  of  an  assembly  language  source  program  for  parallel 
processing  still  depends  on  the  programmer  to  a  large  extent.  This  was 
readily  shown  by  the  sample  program  in  Figs.  17  and  18.  Even  when  using 
implicit  parallelisms  the  programmer  cannot  be  disregarded.  The  same 
process  coded  in  different  ways  will  have  varying  potential  for  parallel 
processing.  Using  this  suitability  checker  could  help  develop 


guidelines  by  evaluating  different  approaches  and  selecting  the  best 
one.  As  a  software  design  tool,  this  suitability  checker  could  be  used 
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to  encourage  programmers  to  look  for  ways  to  facilitate  parallel 
processing  and  displace  the  habit  of  sequential  programming.  Some 
initial  guidelines  follow. 

Of  the  programs  analyzed,  those  which  had  the  best  suitability 
factors  had  simple  closed  loops,  often  with  no  exit.  This  supports 
Ramamoorthy ' s  findings  that  complex  decision  structures  are  not 
conducive  to  parallel  processing.  Conversely,  a  system  with  a  simple 
loop  could  be  set  up  to  use  one  processor  for  an  input/output  driver 
while  another  handled  interrupts  or  did  calculations. 

However,  research  showed  that  some  programs  which  had  a  poor 
suitability  rating  could  be  broken  up  into  subportions  which  showed  good 
potential  for  parallel  processing.  This  was  especially  true  for 
programs  which  were  written  as  one  large  loop  where  the  back  jump  was 
located  at  or  near  the  end,  and  the  return  was  at  or  near  the  beginning. 
If  the  program  is  one  large  loop,  the  SF  cannot  be  measured  except  below 
the  level  of  the  loop.  See  Fig.  16.  This  is  because  the  loop  is  one 
task,  and  the  analysis  will  yield  a  SF  approaching  infinity  as  NX  -  ML  = 
0.  Also  if  the  program  is  one  large  loop,  some  problems  of  parallel 
processing  are  masked.  Forward  jumps  not  really  associated  with  loops 
will  not  be  recognized  as  such,  because  they  are  contained  in  the  large 
loop.  This  suggests  guidance  for  using  loops  in  assembly  language 
programs. 

To  facilitate  parallel  processing,  programs  written  with  a  main 
loop  should  be  designed  to  balance  the  loop  with  other  tasks.  Put  only 
necessary  code  in  the  loop,  so  it  is  not  too  long.  Any  task  or  process 
that  can  be  put  in  a  subroutine  can  probably  be  handled  by  another 
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processor.  Also,  if  any  overlapped  loops  are  used,  try  to  nest  them  in 
another  loop.  More  guidance  of  this  type  could  be  gained  by  analyzing 
more  programs  using  the  suitability  checker.  Many  questions,  as 
outlined  below,  still  need  to  be  answered. 

Suggested  .Complementary  Work 

One  of  the  most  important  questions  remaining  to  be  answered  is  how 
to  implement  the  remainder  of  the  parallel  task  recognizer  for  assembly 
language  programs.  This  could  be  done  in  an  interim  pass  between 
assembler  passes  one  and  two.  It  should  not  be  necessary  to  change 
Ramamoorthy 's  recognizer  significantly  for  this.  The  most  difficult 
task  seems  to  be  that  of  automating  the  analysis  of  task  transitions 
necessary  to  develop  the  parallel  processable  task  graph  for  the 
assembly  language  program.  This  was  explained  in  Fig.  1 .  This  requires 
finding  the  links  between  each  task  and  the  next  task  that  uses  memory 
used  by  that  task  [12,14].  When  this  has  been  done  the  task  transition 
graph  and  the  loop  free  program  graph  could  be  used  with  Ramamoorthy ' s 
parallel  task  recognizer  [11]. 

The  next  question  remaining  to  be  answered  is  how  to  load  the 
parallel  processable  portions  of  the  program  into  separate  memories  with 
synchronization  primitives  as  aids  to  interprocessor  communication  and 
scheduling.  These  would  need  to  be  based  on  the  earliest  and  latest 
task  scheduling  times  derived  from  the  parallel  task  recognizer  for  each 
program  partition  [14].  The  assembler  could  append  these  primitives  to 
each  code  partition  and  configure  the  load  for  two  separate  memories  by 
using  two  program  counters.  Some  trial  and  error  balancing  might  be 
required  to  optimize  the  process  allocations  to  each  memory.  This  would 


be  important  in  the  development  of  a  multi-microprocessor  system.  The 
schedule  or  processor  assignment  must  be  tested  and  optimized  for 
production  execution. 


Scheduling  Considerations 

Lorin  described  a  streamlined  machine  with  specialized  individual 
processors  for  scheduling;  execution;  and  load,  store,  modify,  or 
transfer  operations  [10].  However,  such  a  system  would  require  breaking 
assembly  language  instructions  down  into  fragments  at  the  microprogram 
level.  Also  it  would  require  that  each  processor  have  the  capability  to 
directly  load  and  manipulate  registers  in  the  other  processors.  This 
might  be  advantageous  for  a  dedicated  design  requiring  very  high  speeds. 
But  it  would  require  detailed  analysis  and  many  design  compromises 
without  the  probable  return  on  investment  of  a  more  generalized  system 
using  standard  parts  and  having  more  possible  applications. 

A  simplified  version  of  a  dynamically  scheduled  system  could  be 
built  using  standard  processors  with  shared  memories.  This  was  the 
initial  attempt  at  solving  the  problem.  Initial  research  showed  that  no 
particular  machine  is  better  suited  than  any  other.  However,  the 
NATIONAL  SEMICONDUCTOR  family  of  IMP,  PACE,  and  SC/MP  machines  looked 
promising,  because  they  have  built  in  control  signals  allowing 
"daisy-chaining”  interrupt  type  communications.  Also  the  IMP  is 
microprogranmable  and  uses  the  same  instruction  set  as  the  PACE. 
Therefore,  an  investigation  of  the  scheduling  problem  was  made  for  the 
PACE,  so  that  the  results  could  be  applied  to  microprogrammable 
machines. 
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The  results  of  this  are  shown  in  Fig.  19.  Even  by  searching  on  a 
priority  basis  for  the  most  often  used  instructions  first,  it  will  take 
from  three  to  ten  instruction  cycles  to  recognize  the  instruction  and 
jump  to  a  routine  that  can  handle  it.  Even  using  a  more  sophisticated 
sort  would  take  two  or  three  cycles.  Therefore,  a  microprocessor 
operating  as  the  scheduling  unit  for  an  application  program  would  become 
hopelessly  bogged  down  as  it  tried  to  assign  individual  instructions  to 
the  other  two  processors  for  preparation  and  execution.  This  would 
create  a  serious  bottleneck  at  the  outset  without  even  considering  the 
problems  of  memory  contention,  processor  communicat;ien,  and 
synchronization.  After  this  initial  dead-end  in  the  research,  it  became 
apparent  that  a  different  approach  would  be  necessary. 

It  seemed  that  the  dynamic  approach  was  not  only  too  slow,  but  that 
there  was  too  much  overhead  at  execution  time  due  to  all  the  instruction 
fragmentation  and  transfers  to  different  processors.  Reflection  on  this 
problem  led  to  the  conclusion  that  if  all  this  overhead  could  be 
accomplished  beforehand,  operation  would  be  more  efficient.  The  only 
logical  time  to  do  it  was  before  the  load.  This  left  the  assembly 
process  as  the  only  possible  time  to  do  the  scheduling.  This  seemed 
compatible  with  Ramamoor thy' s  work  Cl,11 ,  because  it  would  be 
easier  to  schedule  and  synchronize  partitions  or  blocks  of  instructions 
with  less  overhead  than  it  would  for  fragments  of  individual 
instructions.  It  was  only  a  question  of  how  to  find  the  parallel 
processable  partitions,  their  execution  time  relationship,  and  the 
necessary  primitives  to  synchronize  them.  Therefore,  the  problem  inves- 
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tigation  proceeded  on  this  basis.  The  suggested  configuration  and  use 
of  this  system  will  be  the  last  topic  for  discussion. 


Configuration  and  Use  of  the 
Multi-microprocessor  System 

The  multi-microprocessor  system  should  be  a  generalized  design 
adaptable  to  dedicated  tasks  depending  on  the  application  program.  Two 
broad  system  classes  would  be  development  systems  and  production 
systems.  The  former  would  need  additional  capabilities  for  editing  and 
manipulating  memory  while  the  latter  would  be  more  specialized  according 
to  the  application.  The  associated  equipment  would  be  determined  by 
user  requirements,  but  both  systems  would  have  the  same  basic  design. 

This  generalized  system  should  have  two  or  more  processors  with  a 
common  read/write  memory  and  a  private  memory  for  each  [12].  The 
private  memories  would  be  read/write  memory  for  a  development  system  and 
read  only  memory  for  a  production  system.  Program  tasks  with 
synchronization  primitives  would  be  loaded  into  the  private  memories  by 
the  modified  assembler.  Processors  would  communicate  using  "mailboxes" 
(I/O  ports)  which  would  indicate  messages  in  common  memory  [2,4].  See 
Appendix  A.  The  individual  processors  would  not  need  to  be  highly 
specialized  unless  this  proved  beneficial  to  the  particular  application. 

Adapting  the  methods  used  here  to  any  particular  microprocessor 
language  should  not  be  difficult  for  someone  who  understands  the  subject 
machine,  its  op-codes,  its  instruction  set,  and  the  set  of  modifications 
made  here.  Although  much  work  remains  to  be  done,  systems  of  this  type 
are  both  feasible  and  useful.  They  can  be  developed  relatively 
inexpensively  for  either  research  or  commercial  application. 
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This  is  a  brief  discussion  of  why  interprocessor  communication  is 
necessary,  how  it  could  be  accomplished,  and  how  it  could  affect  machine 
design.  The  two  main  reasons  for  communication  between  processors  are 
resolution  of  conflicts  between  common  resources  such  as  shared  memory 
and  implementing  task  scheduling.  Once  the  processors  start  executing  a 
partition  (set  of  tasks)  they  must  follow  a  plan  for  transitioning  to 
subsequent  tasks  in  a  predetermined  fashion.  Scheduling  hueristics  have 
been  developed  by  Ramamoorthy  C 15J .  These  are  probably  applicable,  but 
that  discussion  is  beyond  the  scope  of  this  thesis.  The  task  execution 
will  also  generate  resource  conflicts  that  should  be  resolved  in  a 
systematic  way.  Djkstra,  Knuth,  and  Coffman  have  developed  efficient 
algorithms  for  scheduling  shared  resources  [11].  No  matter  what  system 
is  used  there  must  be  a  means  to  communicate  between  the  processors. 

It  would  probably  be  desirable  to  use  the  I/O  ports  of  each 
processor  as  "mailboxes1'  [2,4],  This  would  mean  that  I/O  would  need  to 
be  accomplished  by  memory  mapping  to  leave  the  ports  free.  Therefore, 
the  interprocessor  communication  would  hav^  priority  ever  I/O.  The  I/O 
could  be  designed  to  work  through  each  processor's  private  memory,  so 
the  shared  memory  would  not  be  involved  either. 

Interprocesscr  communication  through  mailbox  messages  has  been  used 
in  systems  such  as  MULTICS  and  also  in  smaller  machines.  It  is  an  easy 
way  to  quickly  indicate  that  one  processor  has  a  message  for  another  and 
imply  the  degree  of  urgency.  The  notification  requires  only  a  byte  or 
word  in  the  form  of  an  address  on  the  I/O  port.  The  message  itself  may 
be  much  longer  as  it  can  be  stored  in  the  common  memory.  The  message 
notification  may  be  accompanied  by  an  interrupt  signal  if  it  is 


important  enough  to  deserve  immediate  attention.  Otherwise,  the 
receiving  processor  may  be  set  up  to  check  for  messages  at  the  end  of  a 
task  or  timed  to  check  at  specified  intervals.  If  many  messages  are 
required,  the  mailbox  may  contain  only  a  notification,  handled  on  a 
schedule,  that  points  to  a  part  of  the  common  memory  where  the  messages 
are  stored  and  prioritized.  In  this  way  the  receiving  processor  can 
handle  the  messages  as  its  schedule  permits.  But  it  will  be  able  to 
accept  a  larger  number  of  messages  than  would  be  possible  if  it  waited 
to  handle  each  individual  message  in  the  mailbox  as  it  arrived.  The  way 
these  messages  are  used  would  determine  the  system  design  to  some 
extent. 

After  the  program  partitions  have  been  scheduled  and  loaded  with 
the  proper  pointers,  the  partitions  would  execute  and  point  when  done  to 
the  next  partition.  This  could  be  done  at  least  two  ways.  That  is,  the 
completed  partition  could  return  to  an  operating  system  or  simply 
transfer  control  directly  to  the  next  process.  This  design  decision 
'would  depend  on  the  desired  level  of  sophistication.  By  pointing  to  the 
next  process  directly  it  would  seem  possible  to  execute  faster  with  less 
overhead.  The  proper  synchronization  primitives  could  be  added  to  each 
partition  by  the  loader,  so  that  the  task  could  not  start  until  it  was 
allowed  to.  Each  partition  would  set  its  successor's  primitives  when 
finished.  On  the  other  hand,  this  function  could  be  performed  by  the 
operating  system  by  updating  a  table  which  would  be  checked  by  each 
partition  before  starting.  If  the  process  was  not  allowed  to  start,  the 
operating  system  could  retain  control  for  more  flexibility  rather  than 
simply  idling  the  delayed  processor.  But  in  a  non-dynamic  scheduling 
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situation  for  a  dedicated  system,  this  degree  of  sophistication  is 
probably  not  necessary.  Tasks  could  communicate  directly  with  each 
other  with  little  overhead. 
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.  AO-A092  216  *IR  FORCE  INST  OF  TECH  WRIBHT-PATTERSON  *FB  OH  F/«  9/9 

CONSIDERATIONS  FOR  AN  ASSEMBLER  SCHEDULED  MULTI-mICROPROCESSOR  — CtC(U) 
*06  80  r  L  STEWART 

UNCLASSIFIED  *FIT-CI-flO-41T  ML 


MICROCOPY  RESOLUTION  TEST  CHART 
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_  in-i.  i 


*A0  P>i-liSC 
*1  *=2*11. *.6 

C  VERSION  2  15  APR  30  -  PEnFORMS  SUITABILITY  ANALYSIS 

C  FQ*  P***AllEL  PPJJESSSNi  • NO 

C  JUMP  ANALYSIS  (30Th  FG-<  „NT£l 

C  5050  SOURCE  CGOE  ON.*). 

*i  02411.65 

CCMMON/LCOP/LPMAX.LPdjC  )  .LPNC.LGjES 

CO MM ON /INST  /  JPX  (  100  0, 2)  ,  JP  (2GG  ,*+)  ,  1STOP,.  •  ST  V=D,  ^FPG 
*1  P2WH.71 

DIMENSION  lPUu  0)  •JPJES(2C.') 

*1  P2**l  1 . 110 

data  jp.lp, jpdes/.oo*c,i:o*o ,2] Id/ 

DATA  JPX  /2;:i*0/ 

*:  F 2**  11. 125 

L 1 NENM=C 
*3  P24 11.160 

GC  TC  500  .  ' 

*3  °2 411,16  7 

GC  TC  500 
*C  F2 41 1,173 

■j  O  T  C  5  0  0 
*J  P 2411 , 18  I 

GO  TC  50  0 
*1  02411.114 

c 

C  ZERO  VARIABLES  FOR  SUITABILITY  ECUATC"1 

C 

C  SUITABILITY  FACTOR 

SF  =  Ci .  0 

C  INPUT  /  OUTPUT  INSTRUCTIONS 

MO=G 

C  ARITHMETIC  /  LOGICAl  INSTRUCTIONS 

NAl  =  C 
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C  CAllS 

N  C  =  0 

C  JUMPS  BACKHANDS  <LuO°S> 

K6»0 

C  JUMPS  rJxWft’-fJ  (NOT  RELATED  Tu  A  lOuP) 

NPs  C 

C  EXECUTABLE  InSTnuCTIGNS  (NX  ALSO  JSED  TO  J£-£  MINE  UlI 

kx=  : 

0  1  f  STr‘ UCTI OnS  WITnlN  lCCPS 

NL  =  0 

C 

C  .NXTiAuIZE  JUMP  TAS-E  InCEX,  NJ 

N  J=1 
C 

C  JU.U-  TAiLt  CONTAINS  t  < 1 ) JUMP  ..JO^ESS*  II)  J;ji.  ifcul.'i 

C  ACQPEoS,  (3 )  JUMP  TTPc,  AND  (U)I'iLEx  (NJ>  je 

C  ASSOCIATED  LOOP 

C 

c  jump  types  spei  ..oof  cl~->s~.>  a~.e: 

C  I  CJ'-OI  TI J  N  A  *_  gaCKWA^D  1  S  .  .  I 

C  c  O'iCG  NOIT  I  ChA  t  SACKwAPO  c  1 TO- >  £0*  A  TE 

C  6  C  JUG  *  T 1 0  N  Al  EOR^AkO  3  0:,,c^c'» 

C  6  UNCONDITIONAL  FO-«WAr.O 

c 

c 

C  WrxTt  HEADS  FO-  GUTaUT  CAGE 

w 

ficlTE  ( 5  } » b  0  3  4) 

6134  FORMAT  (1H1,UX,“— -INTEL  6Jiv  JUMP  AnAw»3IS - vS*  " 

1  "2.  JO,  IS  AP ft  40 - *V) 

•I  P 24 1 1 . 1h 7 

CALL  SmSSCH  (lABEl, KPTM 
JPOES  (<PT  •>)  8  NX  + 1 
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■:C  =  \2*i 
*0  3?**i  I  ,  1  97 

;u  GO  TO f35u, 15,16. ia,?3. 19. 2C. 19. ZC, 19. 21. 2l.Ci.il .ICO" 
*:  °?i*ii.cc7 

CALL  S^SPCH  (LA3£L. «pT  =  I 
JP"I£S  <  «®Tn  SNX  +  1 
*1  o;uii,217 

2  £  '-t  F  2  K  SUITA  TIL!  tv  pop  PARALLEL  PPCCES2  l,v  DETERMINING 

r  I'JSTSL'CTTM  TYP'F  AmO  USING  A  -0?*ULA  t 3  2h£C<. 

2  A  =  r  TMMPTIC  LOGICAL  INSTRUCTION? 

r .:  :=■<•<•.  gt.:)  gc  to  ei: 

T^r^'CE.' 0.2.0®. ICCnS. £2.  11*1  GC  T  0  -3; 

!«rd/'  CD-2. IT. ?7.  A*IC.  (MOD  CI200E,  9»  .£0.2)  )  S'*  rr  5r* 

GC  Tc  5  =  ,- 

n 

2  2  2'J'iT  A-IT^NrTI2  ,-  =  LOGICAL  INSTRUCTION 

~  ■>  a  L  =  *;.*.  L  *  1 
2  ^  TO  -=  ' 

v 

2  ~AL-  r  =  JUM0? 

=  1'  :"<«•.'  -  GC  T  2  5^-’ 

Ic  <«0nrCC.2r,?l  ,LT.«I  SC  T  2  =  20 
**  rM2I  T  I  2.‘!  AL  gill'’ 

Ip  («nr  (  T2C0E ,51  .NE.5I  GO  TO  C1C 
•iC  =  n2*1 
GC  TO  51= 

51;  MFsfiF  ♦! 

=  •  =  g  o  r  n  =  =  ■ 

*?*  IC(I2 rOE.LT ,1941  GO  TO  *52 
r> 

■J 

r  JL'M  =  S  -  C *r  0*  7*0?  I,  0 J I L 0  TAPLt 
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C  Flf-.ST  PI.*0  :?  JEST  ;s  DEFINED  (I.E.  F*C  D-  iAC<) 

CALL  SMSRCH  CCFl »  < JE  ST ) 

C 

IF  HCOD£«f-E  .  1:. 5  )  NO  TO  527 

C  UNCONDITIONAL.  if  ewj,  DON'T  KnG*  OES1  hD0>£0L-  >ET. 

IF (LOOS (<CEST ) . ED.-l)  GO  TO  525 
C  DESTINATION  IMSTxUCTICN  no.  OF  JU.10  E-l* 

jP(NJ,2)=JP0Ei(KD£ST) 

c  jur'P  class 

JP  ( N  J »  3)  =2 
J  F  (NJ.4I =NJ 
GC  TC  529 

C  UNCCr.ClTICN.iL  JUNP  FO-wA-.D 
525  JF(NJ,2) s-<jEST 
JF(NJ,3)  =  - 
00  TO  530 

C  Cf MOITI JNAl.  IF  fnO,  OON'T  KNCN  OEST  ADDRESS  > ET  . 

52  7  ;F(LGCS(i<OEST)  .ED.-l)  GO  TO  525 
C  LESTI  NATION  INST-uCUCN  NC  .  Cc  JCMP  -jAC< 

JF  (NJ,2)=J£»OES(KOtST> 

c  jump  class 

JF  (NJ,  3)  =: 

J  F  ( N  J  »  A )  = J 
oO  TC  52  3 

C  JUMF  FJ*W£30 

r  2d  JP(NJ,2>=-<OEST 

JFU'J,  3)  *c 
GC  TO  53G 
C 

C  lCUNT  A  (.COP  AND  IMG'EHEM  LOOP  T^3lE  0 0.  .TE 

C 

529  in c*NijAl 


C 
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c  s*ve  jump  t a  3lE  index  cf  ^oop 
c 

LF (we ) =N  J 

c 

C  JUMP  TA5LE  OVERFLOW 

C 

53 C  CONTINUE 

IF  (Ns.GT.lOO)  WRITE  (50,5391) 

9991  FORMAT  < IX , “♦♦♦ wa^n ING.  ARRAY  l?  JVERFI ulE 0.  " 

1  "MORE  TmAN  1 0  j  LOOPS."/) 

:f<nj.6T.2o:>  w-ite  (so.-^ei 

5996  FORMAT  ( 1  X**>  ♦♦  wA  rc  .«iN  3«  ARRAY  JF  3 V E \FI l lEO  .  '  ,  -  E  ThAn" 

1  "  200  JUMPS."/)  • 

C 

C  JUMP  address 

c 

JF ( N J» 1) =NX  +1 

L* 

C  INCREMENT  JUMP  TA3t.E  POINTER 

C 

JF  X  (LINENM,!)  =  N  J 
N J*NJ +1 
GO  TC  550 
C 

C  i/o? 

c 

5-C  IF(K.NE.3>  GO  TC  55 J 

IF(lCC0E.-Q.2ll.w>R.ivCuE.EQ.213)  NlO  — 

I F ( MCO ( ICOD E , 6 ) .EQ.6)  GO  TC  5k3 

C 

C  COUNT  EXECUTASut  INSTRUCTIONS 

C 

55  3  NXsNXn 

JFX(LiNEN.' ,2)  =  IX 
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uC  TO  1 
C 

C  CHECK  LOOP  CONST -.JCTS  AT  EhO  LF  PASS  Cf.E 

C 

C  SAVE  HQM  i-iA  it  V  JUMPS  (NJrtA.X) 

C 

55*  CCNTif.'UE 
r  JMAX=fiJ-i 

c 

C  SAVE  NUMBER  OF  ljC-S 

C 

lpmax  =  NJ 

C 

z  FIND  is X  OF  F<*0  JUMP  DESTi^T-C.IS  T 0  USE 

C  IM  CALC'JLAT  icns  fc^  nl  later 

c 

C  CHECK  iu  JUMPS 

C 

CC  557  JC  =  1,‘4JMA< 

C  -LkEAOT  <NO*  NX 

IF ( JF  (JC ,CI  .GT. u )  GO  TO  557 
C  IF  NOT,  CHECK  itu  SVM30LS 

OC  555  JS=I »SM 3PT- 

C  IS  THIS  JUMP  A 3 SCC  WITH  THIS  SYieOL? 

IF  (JF  (  JCt  SI  .EQ.-JSI  JP  ( JC«c)  =  J=>3ES  ( J3  ) 

555  CONTINUE 
557  Cl  NT I nu£ 
r 

J 

C  iVXN  LOOP  OF  TmIS  PORTION  *NOtXSS  JUNP  si.*0  .hEIK 

C 

CO  590  MC=1»NJ1AX 
h.  J*w  J-MC 

C  HAS  THIS  JUmP  3EEN  ASSOCIATED  WITh  4  ? 

IFUMMJ.U)  .GT.C  .AMO.  Jh(mJ,3)  .ST.  21  TO  5:. 
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C  iHANCH  ACC0KuI4G  TO  JUMP  T  yo£ 

IF  <JF  .GT.G.ANw.  JPMJ,  3)  .lT.3)  GC  TC  Z- i 

IF(JP  (.MJ.3)  .t-J.6)  cO  TO  576 
iFCJP(MJ,3>  .EJ.3)  GO  TO  576 

C  IF  IT  WAS  NOT  ONE  OF  THE  A£C\/E»  Shlw  a;,  £i\-0  •  *5. 

w*  ITE  (51  »99  '  -)  MJ 

39A9  FCKMA1  (IX  .“♦♦♦WAGING.  UNCuAail- If J  J  U  f  *V  ) 

C  Ofc'TEFt KlNt  LOOP  CLhSS  I,  2,  u.-  3 

C 

C  This  LOOP  Checks  FORWARD  JU.IHS  ( f.Fl >  Tu  ASiJC.ATf 

C  THE-  WITH  THE  BACKWARD  JU^P  (hj).  Tj  BE  A  E  SuC  *  A  TE  C 

C  THE  FORwA^O  JUMP  MUST  STA*.T  if  A  _JOP. 

C 

562  CONTINUE 

C  *MIIAlIZ£  cNT\Y  (<FE>  A  40  E»1T  (KFj>  f.»i.  T.  4 UN 

<FE=G 
KFC=*. 

MFi*r j-i 

CC  57 h  NC Is 1 1 VF  1 
mfi=t j-NCi 

C  IS  IT  A  FwJ  JUMP? 

X  F  ( JF  (,NF  1 » 3  I  «NE  •  6.  A  NO  .  JF  IN f  I »  i  )  .  Nc  •  5  )  C*C  t  ^  -7„ 

C  YES.  DUES  IT  ORIGINATE  IN  THE  ta0CF? 

IF  (JF  (;,Fl,l  )  .GE.  JP  (  1 J  .  2)  •  A  NO  •  Jp  ( NF  i  *  1 )  .lT  .  J-  <  J.  ;  ) 
1  iO  TC  5b** 

C 

C  NC.  COES  IT  JUMP  IN? 

c 

IF ( JP (NFl,2» .uT . JP(MJ,2» .Ok. JO( (Pl,2)  .GT.  JP«  !J,1) ) 
1  GO  TO  574 
CA„U  FINDuP(MJ» 

WF  ITE  (50,9hG3)  (*Fl,MJ,LPNG 
GO  TC  574 


C 


C  L'CES  ORIGINATE  IN  THE  LOOP.  UES  iT  Ji;M-  0J7? 

C 

564  CONTINUE 

IF<JP(NPi,2» .GT . JP(MJ,i)  )  GO  TO  5o5 
C  NOTE  x  NS  I GN  x  FI C  A  nT  INTERIOR  FWJ  JO;1®« 

CALL  FINOlP(MJ) 

IF  tjPlMJ,3>  .E3.L)  WRITE  (50,5905;  \'Fl ,  NJ  , 

GC  TO  572 

C  YES.  JUMP  OUT.  ENT  *  Y  IF  COl.Oi  T  ION«L. 

565  C  CNT I NUc 

IF  (JF (nFi,3) .EI.6>  KF E  =  i 

c  £  fit  hay  exist  since  there  '  io  a  jimp  cur. 

KFu=l 


C  CHECK  all  PREvICUS  Fwc  JUMPS  10  lETEF  EG. 

c 

NCAisYiFl-i 
GC  57C  NC16=1,NCA1 
NftaNFi-NCio 
C  IS  IT  FW 0 ? 

I  F  ( JF  (N  6 »  3 )  •  t'«  E  .  6  .  A  N  0 .  JP  ( N6 »  3 )  .  N  E  .  r  )  GC  TO  5  ^ 

C  INSIDE  JUNP  AROUND? 

IF (JP (Nb, 2) .LT. JP(\ j, 2) )  GO  Tu  57. 

IF  ( JF  <N6, 1)  .  ST.  JP  (rJFl  ,1)  ,CR.  J?  (  4fc  , 2)  .  „7 ,  J  P  (  .F  i ,  i ) 

1  •  CP  »  JP  ( No  ,  2 )  •  G T •  JP  (  H  J .  1)  »GR  .  jp  ( 4  Jf  2  )  •  G T  •  J  ;  ('.>?,  i  )  J 

2  GO  TO  5 7 J 

x r  ( JP  (Nf.  i  il  .  jT,  j  )  WRITE ( 5w  .99  57)  N 6.1  A 6 *  (  J 3  C c  ,  >*  I  I 
JP(Nf  .4)  =  JP  (  M  J.  4  I 

591 3  FChH.AT  (IX,”  NOTE.  CONO  FWO  JUMP"!-."  EnTE-S  3->K  JU-.P* 
1  14”  (  x  CO  P  "i  4 M)  FnLH  CUTSI0E  ITS  ~AnGE.*V> 

C  CU*FEnT  JUiP  TYPE  F RON  DC  57^? 

IF  CJf  (Nri»3).EQ.5)  GO  TO  5t6 
C  TYPE  6.  Ij,  jUria  A^CUNO  A  TYPE  6? 

I F ( JP ( N6 »  3 >  . E Q. c  >  GO  TC  57„ 


94 

99C2  r  C°MAT  (lx*****  POSSIBLE  EE-C*.  M  EjT~Y  T,  u-.’-  "  "14 

1  "  (9ACK  JUN®  ”14"  >  UNLESS  ;Y  f ETU.-iL  .'.IT  uCu^N  " 

2  "FF>  CM  JUMP  CUT."/) 

<FE  =  1 

C 

c  complex  Class  3  lolp? 
c 

IP(JF (KJ,3)  .NE.i)  JP(MJ,3)=3 
GO  TC  57 j 
5  66  C  (.  NTI  r;UE 
C 

C  TVPE  6.  IS  JUMP  A-.CUMO  A  TYPE  6? 

C 

IF (JP  <  N  6 »  2 )  .  £  Q  •  6  )  <Fj  =  0 
5 7 j  CONTINUE 

lALl  FIMClP(MJ) 

I F  ( JP  (M  J »  3 )  .EQ.2.AN0.KFE.NE.1)  WkITE  <  >  .  3  9  i  > 

KF£=C 

572  CONTINUE 

IF ( JP (NF1  ,4)  .ME.: .AnO. JP (NF1.4)  .NE.JP (f  J,  -) ) 

1  WrvITE  (51  »95  97)  hrl»iA2S(JPtN'Fl»-*)) 

J ° ( ;.F i )  =  J °  (VJ  i.) 

: 7  +  CONTINUE 

CALL  FInQlP (4J) 

le  ( JP  ( fl  J  »  3)  •  r  G#  2  •  A  \  j .  KF  0  •  i<  E  •  I )  Mt.Te  ( t  *  -1  ■„  .  )  ’>  o  .  L  P  *  v 

GO  T 0  5 9 j 

J995  Fl^MAT(1X."NJT£.  COM  rWQ  JUMP  J.  .  <  jUiP  ” 

1  14"  <lOOP"I4”  ),  BJT  IS  NUT  SIGNIFICANT  T 1  -Ts  " 

2  "ST^UOTU^Al  CLASSIFICATION.”/) 

9997  FCKMAT  (IX, "NOTE .  F*0  JUMF"i4”  ALSO  ACSM  ^IT  h  JUI  F" 

1  I-"."/) 

C 

C  THIS  LUOP  CHECKS  AlL  SUBSEQUENT  L  rj°S  <  <F  *  >  -  A-lNST 

C  The  Fwo  JU1P  <MJ)  PCUNy  IN  Ou  5iC  TO  SEE  . f  :-J  JLVPS 
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C  ABOUND  AN*  OF  T Ht  OTHER  LOCri  (‘(Fa)* 

C 

'j 

576  CONTINUE 
NC6L  =  NJ.-IAX-mJ 

00  577  .(C i= 1 »NC  6l 
ii  F  3  =  N  J—iN  C  3 

C  IS  NFS  4  BACKWARD  JUMP? 

IF  (  JF  ( NF(j  ,3  >  *  GT  .3)  GO  TO  577 
C  YES.  DOES  MJ  JUMP  ARCUNO  LOOP  NFb? 

IF  ( JF  (MJ*  2)  ,u£.  J3<NF3  ,  1)  .Cft.  JP(.iJ»  1)  .  01  .  JF  (  F - ,0  >  ) 

1  GO  TO  577 
C 

C  0.0  IT  JIMP  AROUmO  A  FWj  JUMP  A. SO? 

C 

if  (  ,j  .  i.amo.  jp  (-jp  i-j,*)  »3)  .  :t. 

i  w**IT£  (5  •1*39911  M  J »  -  J  P  ( M  j  ,  4  ) 

C  LL  FIN0.P(MF3) 

W-ITE15J*.  j  ’<*>  MJ.NF-.LPNC 

13  54  FORMAT (1X*MNCT£.  FWO  JUMP"I4M  AROU  <0  i-ACK  Jj  >  " 

1  I**”  (uo  CP"^“  )**/) 

C  ASSOCIATE  WITH  -i_GOP 

JF ( M  J  * 4) =-NF3 

577  CONTINUE 
f.FU=M  J-l 

C  TrlS  LOO3  CHECKS  ONLY  PREVIOUS  FrtO  JUMPS  i  l.'J!  h.^InST 

c  the  fwo  jump  (mj)  to  see  if  there  is  t  ju-c  --junc  mj. 

DO  564  NCP=1.NFU 
NCU=MJ-NCS 

C  IS  NCU  A  FWO  JUMP? 

IF ( JP (NCU  * 3 ) •  lT  #4)  GS  TO  534 
C  YES*  OOES  SOU  JUMP  A  ROUND  MJ? 

IF(JF (NCU*2> *LE. JP(  ij,l) )  GO  TJ  664 
C  YES. 


ALREAlY  AiSCC  WITH  OThEP  F WO  JU1PS? 


*F  tJP  (^oUl  -♦)  a  lT  ■  l  i  M-iOt  (  Jp  {Tiullf  ♦)  )  •  N  Cl  •  0  i 

1  NCU,-JP  (  NCU*4  ) 

1  WRITE  <5a,939fJ)  NCJ,-JP  {tiCU.to) 
lSSCC  JUMP  r.CU  AiTr  -MJ  TC  SnCW  JU  IP  /w'JNO. 

999a  FORMAT  (IX, "NOTE.  JUwf  "Ito"  AlSJ  JUnPEC  AOJ-.r  " 

1  "JUMP"!  A‘*.  "/) 

JMNCU,4)  =  -H  J 

NCU  CONOI  TICNAt.  ? 

if ( JF  ( NCU  ,3  >  •  E3  •  t )  GO  TO  5 6* 

W*ITE <5u ,9993)  NCU.NF* 

5:4  CONTINUE 

9993  FORMAT  (IX,  ”***  =GSSl3LE  £  P  R  0  *  .  Ui.CO.ND  F«J  Jj  P  “ 

1  lu”  AaCUnD  F  WO  JU-tP  ”  I  *•”,”/ ) 

IF  FwQ  JUMP  (MJ)  IS  MCT  A5SGC,  ASiOC  IT  «.fn  ITSELF 
I F ( JF ( M J, to) , EQ, C )  JP(MJ,4)=-MJ 

9 }  l>  1  FORMAT  dx,”***  POSSIswE  ERROR.  NO  EXIT  F-Z-'.  -ACk.  ” 


*  •«*  -  UCT  *  cN 


1  "JUMP”!-,"  ( C  OCP”Ito”)  UNLESS  9 V 

2  "Cf«  OVE  <L*j  P°ED  LOOP.”/) 

ENO  (•  AIM  LOUP  TO  CHECK  JUMPS 

<  CONTINUE 


THIS  LOOP  CHECKS  THE  FINAL  JUMP  T  A3tE  TO  - 1  m  j  ^.Ch 
LOOPS  A  aE  NESTEO  ANO  TO  SEE  IF  ANY  ARE  E A  f  ?E  C 

WHICH  WCULJ  CAUSE  ERRORS  IN  Th£  JJM3  ANAwYS.j, 
L^GES=LPM AX 

Flag  outermost  loop  pc*,  listing 
LP(LCCEi)  =  -^P(uOOES) 


OC  5  8  6  <CiJ=l*LPMAX 

LH£CK  THROUGH  wtoJP  TAawE  FACKWARlS 
KC2 ( L  PMA X ♦! ) -K3N 


f  > 
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G  xf-  FI  -  ST  uJCF  £*.  F  •.  G  .3  *  f'j  n£ED  Tj  ;-!u  “.GST  . 

lMKC.E9.il  oC  Tj  567 

C  ° :EVI CUS  uOOP  NESTED  in  P-ESEnT  lOJP  (<C)P 

IFCJFUIA  :S(l3(<C))  >.2).»E.JPUP(<C-i)  ,_)  )  ^  TO  > afc 

1 F ( KC » £Q • l 0 JES )  30  T G)  565 

C  PREVIOUS  lOOF  NESTED  IN  GUTE.E.-jST  wCuP  l_  IDES'? 

IF  ( JF  {  ( I A  -jS  (  L  =  ( L.  DDES)  I)  ,  2)  ,«.£  .  J3  t_3(  <C-i)  .  c)  > 

1  WHTF  (51,9962)  KC-1,UCD£S 
C  PREVIOUS  lGOF  uVE-. LAsPED  niTn  P-.ESE  IT  LJ^  (  K<.  i  ? 

3  6  6  IF  (JP  (LP  ( KC-i )  ,1)  .GE.JF  (l°  (KC)  ,2)  .ANu 

1  . JF (LP  <  KG )  ,2)  .  3T.  JP (LP (KC-1 1  ,2  )  I 

2  W.-ITE  (50,;-:.)  KC-1,KC 

i  3  9  »  F  CP.  HAT  (1X,**+  +  *WA*N  xwG.  LOOP**  l4‘*  DVE^l^-^ED  .  T H  LSGP 

1  if,  Sr  VAuUE  -AY  NOT  3E  Nt- ‘1j.«^FUw  .  '-V  ..CT" 

2  M  EE  ACGJP  ATE  .  **/  I 
C-  C  T  C  5,7 

665  **ITE  <5w,96;2)  KC- 1 »  <C 

3992  FC  P»VAT  (IX,  "“.CT  E  •  lODF"Ia"  NESTED  IN  lCIP'-IM) 

.11  AT  IS  wOOES? 

5-7  IF  (JF  (LP  (KC)  ,2)  .  IE.  J3(  (lAr-SUP  (LODES)  I  )  ,2)  )  TO  56  c 
C  .NOTE  CUTE-MOST  loop  In  TAdLE 

lCOES  =  KI 

IF  (LCOES. E9.LP.rAX)  Sl  TG  53c 
LF(LCHES)  =  -lP(wCOES) 

*  •  oLNTiJ.UE 

C 
C 

g  this  loop  *,i£l<s  t-ie  fin»w  j:j-p  t-ble  t. 

C  FIND  The  TJTAl  m  L  6  d  E  x  GF  FCMA^D  J'Jlt  S 

C  f, CT  ASSOCIATED  *ITH  LuOPS 

C 

JC  591  1CJ=1,'J’TAX 
IF  (JP(ICJt-)  .LE.O)  IF  =!»F  ♦  1 


3  ‘  1  CONTI  CUE 
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;  C.OJNT  I*JiT-?UCTIC  o  I  I  LOJFS 

1 .  S  T  L  =  u 

Ol  S'-.z  n  C I  •  T  =  I  * i.  P  '•!  A  X 

IF((_f  ( NC N 1  t  tGEtU)  GO  T U  594 

INSTL  =  J?  (  IJ  £S  (  Jr  (NCNT  >  )  ,  1)  -  JP  (iAdS  (u- <ii:  T)  )  ,  i) 

NL  =  NL+I'ISTl  +  1 
GG  TO  5  95 
5  9-+  CONTINUE 

MB  =  N8  -  1 
5  95  CONTINUE 

IC  =  PLCAT  (I.IC  ) 

*■  L=F  L  C»T  I  NA  «. ) 

UC=FCOAT  (NC) 

9J=FLCAT  <i,o ) 

FJ=FwOAT(.iF) 

E X  =Fl  CAT  <  >.X  ) 
p|_=FLCAT  I’  Ll 

■FUKX-NJ.l-.J)  GO  TG  5952 
EF=<<IC+4L+UC+3J-FJ)/(EX-Pl>  )  -  l^L/E.v  ) 

W-ITt  (53*9l«iI)  3F 

t'j.L  FijkMAT  (”  SuITAsIlITY  FACT  Or  F02  pa-ULlEl  l  jZZCZ-IsG 

1  ”  IS  “F  7.2".  A  \Y  VAlUE  O^EATE"  T,m,  1.3  IS  FA  Vo-" 

2  "AStE."/) 

<-  C  T  C  5  - 1 

-952  W-1TE  <5J.9:iJ) 

-i'.ti  F  C  -  *■'  A  T  (!/" - lOGH  STRUCTURE  IS  NJT  CLIT-E.E  r\-  “ 

1  “Fp.rAl.wEc  P*. C  vE S  SI  nu  AT  TriS  ..  c.  V  E«.  •  “/ ) 

5  96  CONTINUE 


;  PUNT  THE  JUMP  TA3i_E 

* 

6f>b6  h  a  IT  E  (5j«  5  999) 

5  99  2  F  CROAT  <  1  >  ,  T5  :>  »  “  J'JM  P  TAEwE’VX 

1  IX  tT  46 » “NJii  AX“2  x“M  J”2x'*f; Al'*cX”i.C”3X "N  3  x“‘.t  ”3  *. 


99 


2  ”  I L ** 3 x”N.\”) 

*  ~  IT  E  (5^*63 *3)  ‘(JMAX  f  Ni  0»  i\«i.  i  t  ■<  3|ii  *  t  »l  t 
6Qu3  FORMAT  (1X»T46»c  ll4»X>/) 
kc  ITE  (5w  i6J  w^) 

6cj5  F  CRMA  T  ( 1 X  »  T45”L  GOP  JUMP”*  IX  ,  “FROM"*  2  X  «  “T  0  “«  2X  » *’T  Y  c€  " 


1  IX’VSSOL”) 

JNX=0 

OG  72  Cm3  I0=1»NJMAX 

wFITE  (5j,63!:>  1 0 «  (JP(IOtlV)  ♦  I V  =  I » > 
b2i'3  FC\MAT  (lX,T£0,i3»Xt-(I«,X)/) 

IFdO.EJ.  JPiID,->  j  JNX  =  jnX41 
:F(I0.£'J.JPU:,4)I  WRITE  <50,G015>  JNx 
0G15  A T  (IX  ,T^5,  13«2X,25H*******»****************») 

7 C  C  u  CCNT.  MJc 
C 

C  P>-'INT  THE  JUMP  ANALYSIS 


W  F  ITE  (5  u  *  44  •}  j  I 

i+*C  FGP-AT  ( IX,  T-+7,  ”  JUHO  ANALYSIS”/) 

C  INDEX  THr.oU*jH  JUMP  TAolE 

DC  597  I C  =  1  » N JM  A x 
IF(jp(lC,o)  .LT.&)  GO  TG  597 

C  CrECK  fu«  » NY  Ci tCUMVENTEO  INSTRUCTIONS  (  J  (CO?  U  VU!*8I 
IF  (JF  <IC»3>  ..Nt.  3)  GO  TO  5964 
C  LOOK  FuR  PkEYIJUS  FORWARD  JUhp  TO  UST-UITIGN 

I  F  TEF  U'iCC NO*  T*  CNAt  JUMP 

ICL=IC-1 

OC  6962  nc  =  1,  ICl 
I GCH  ~  XC  ~  i 

IF(JF  (IDCN,  2)  .EQ.  (JP<  ICt  1)  ♦!)  I  JO  TJ  =9-5* 

3st2  CONTINUE 

C  u OOK  FJX  luJF  3 AC<  TG  INSTRUCTION  A F T £ ^  JL.'.P 

I CN=I C  »i 

DC  6.’t3  I  00  =  *CN  ♦  *  JM  -»X 


I  :c  ( Jp  (IUC.2)  .EQ.  ( JP  CC.il+l)  )  GO  TO  5  9r:o 


tj96 3  CONTINUE 

C  It  NC  PA  T  h  wAS  POUND*  WRITE  E-ROn  MESSAGE 

NCPATh  =  JP ( i C  *  1 ) +1 
wflTE  (5-*3<+93)  NOPA1H 

9h93  FORMAT (IX ,T16,"***  POSSIBLE  fc  -  R9-.  No  PATH  T  .  " 

1  "InSKUC  Tj.CNi“j.*"UNil£SS  BY  CAll."/) 

5  96**  IF  (  JP  (IC,4>  .\E.  u  .AND.  JP(IC,4>  .Nt.-IC)  oO  TO  ~'-7 
wrITE  (50*9490)  i.C*J=(lC*l) 

9490  FORMAT  ( 1  < ,  I  3  0  ,  "Nu  T  E .  FWO  JUMP  "±3",  Ex  ,0."Io 

1  "  IS  NOT  ASSOCIATED  .-(ITh  A.,y  ;_COP."/> 

5-7  CONTINUE 


w-iTE(5C,9?  :.) 

95  C  0  FCRVAT <9X. "LoOPS  In  This  PROGRAM"/ 

1  IX,"  luOP  #  "  *  r  X  *  "F  S.  0  M  " ,  6  X  ,  "  TO  ",  6 X , “0 _ - ' S "/ ) 

DC  599  IL*1«lP'1A* 

H'lTE  (SO*  .520  I.,  JP(iAJS(L»(  j'U  )  » 1>  ,  JPC  -  .  jS  ( -P  (__  > ) 
1  ,2)  *  JF  (  IAES(LF  (ID  )  ,3) 

9  r  2  D  FORMAT  (3x,l4,4*6x»l4,4,6x,  I4,l,':,x,I3) 
w iTE  (  5 1  *  95  3  j I  I  _ 

353S  F  okMA  T  ( cX  *  T  5  0  * "r  •O'xW  ARC  JJi  PS  ASSOCIATE-  w  I  7  - 
1  "LCoP  " , I  3 / ) 

KFF=0 

DC  59c  IC=l*NJMAx 

IF(JP(IC*4)  .-*£.  LP(I  L>  >  Go  TD  3  99 
IF(JP(IC,3)  .GE.l.ANC. JP(IC,3>  .lE.3)  GC  TD  39s 
W'IT£(E3*p<*9c)  IC,JP(iC,l> 

■3496  FORMAT  (IX  ,  T  r  I » "  JUMP”!  3”  Ex  I N  S  T  9  NO. 

KFF=1 


NL  To  22 
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•l  “2411.23? 

C  w  I T 1  A  Li  Z  ?  TO  jHJW  LOOP  F  *  E  E  P  \0  u  ■*.  4  "t  !•  !■.»?«  (  _  r  P  G I 

Lr'PC-  =  J 
KSTGFn  =  \X 
C  ANY  LC0PS  ? 

1 F  <  wPf'AX  .  £Q .  j  >  Gu  TO  23 
C  YES.  ►tSTPPE  LOOP  T A3lE  ENTRY 

LP(LOOES)  =  -uptwOO£3) 

C  SET  FLAG  To  STOP  NUliEKS  AT  DESTINATION 
N  S  T  C  P  N  =  Jp (LP(LOOES) .2) 

G  SET  FlAG  TO  STOP  .  AT  JUNP 

«STCPG=  JP  Uc(l DOES)  .1) 

P2-U.5l« 

1  3lf.Vtr.  2.03,  lS  AP-\  ;  j  Pwt'E  ,X3//> 

*1  P2411.E10 

SUBROUTINE  Fi:.'OL=(I) 

C 

C  T  »-IS  rv 0 U T  i  >1 1,  r  1  f i  j S  T  -i£  LOOP  U ."i 2 E r.  ( »_ 5 „■ )  '  ~ 

G  A  8AC  <  JU‘p  (I)  rC-.  EFE^ENCE  PJl'bo3E  2  j  3  X  -  3  3  T » 

w* 

CCMNCN  /L  GO*/  LF -iAX  ,LP  Ucw  >  ,LP  n.LOOES 
LPNO*L 

DP  ICC  UC  'jT*1  ,u  F  'iAX 
ip(I.EQ«cP(LCnT)>  GO  Tu  3lu 
103  CONTINUE 

*  F  ( Lr  NO.  EG.  v  )  <4ftITE(?L»*«!CC)  I 

PCkMAT  (IX, "♦♦♦WARNING.  oCGP  NO.  FOR  RACK 
1  ”  NOT  P’u  UN  0.  **/) 

303  LPNU=LCNT 

*  E  TUP  iN 
ENG 

*1  P2411.721 

COMNCn/InST/JkX  (1302,2)  ,  jP(2Ew,^J  ,rtSTc*N,  -sSTOPO^F 


1 
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C  r  ;  /l  V  -'/L'-liA.LMi:.  .  ,U  Jl£  i 

*:  P _-l  1 « 72  2 

L  P  A  =  Lf5  ' »  A  -  i 

*0  P„-*i:.7«.C,P2-11.7<*l 

c 

0  S r C W  NODE  S  Or  _  jZ^  r-EE  c  ->r-*i 

C  (Lfcjl  jsI  kiSTING 

q 

0  LCMmE.'.T? 

*  in  r  I T  F  <5. <6-321)  JPX  (  l  I  ■<£  Nr*  *  2  ) 

oc2l  FC®  ;AT  ( 1  l  X  » **E<  .J.  =”!*/) 

-F  (  JFX  Il1E\v,.2)  .£Q.u)  i>C  TO  jj; 
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PROGRAM  SPECIFICATIONS  OF  THE 
MODIFIED  ASSEMBLER 
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This  is  a  brief  discussion  of  the  size  of  the  program  and  hew  much 
time  it  takes  to  run.  This  modified  assembler  requires  approximately 
34K  to  load  in  a  CDC  6600  and  54K  to  execute  with  its  associated  system 
routines.  The  compile  time  is  approximately  8.7  seconds,  but  it  could 
be  loaded  from  disk  in  a  fraction  of  that  time.  The  execution  time 
depends  on  source  program  length  to  an  extent,  but  mainly  on  source 
program  complexity.  Execution  times  of  0.645  central  processor  (CP) 
seconds  to  9.1  CP  seconds  have  been  noted.  These  were  for  approximately 
85  and  1200  program  lines  respectively  including  instructions  and 
comments.  However,  a  test  program  of  approximately  120  lines  that  had 
many  junps  took  8.2  CP  seconds  to  execute.  All  these  nunbers  depend 
also  on  the  host  assembler  to  some  extent.  The  host  used  is  a  fairly 
sophisticated  large  program  that  can  assemble  either  INTEL  or  MOTOROLA 
source  code.  To  modify  the  assembler,  arrays  totaling  approximately 
3100  words  were  required  to  be  added.  The  present  version  can 
accomodate  source  programs  with  up  to  200  junps,  100  loops,  and  1000 
lines  of  instructions  and  comments.  These  arrays  can  be  easily  adjusted 
to  smaller  or  larger  source  programs. 
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APPENDIX  D 

INTEL  8080  OP-CODE  GROUPS 
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For  INTEL  8080  assembly  language,  checking  op-codes  to  count 
instructions  is  easier  than  checking  mnemonics.  Mnemonics  would  require 
checking  character  by  character.  The  result  obtained  would  require 
checking  against  a  table  or  some  standard  to  decide  which  ones  to  count 
(NAL,  NC,  NIO,  or  NJ) .  But  op-codes  are  made  available  in  pass  one  by 
this  assembler  for  checking  assembler  directives  (pseudo  op-codes). 
Since  pseudo  ops  are  of  no  concern  for  this  analysis  of  instruction 
types,  only  valid  instructions  are  checked. 

Instruction  op-codes  are  grouped  in  a  way  that  allows  easy 
instruction  identification.  Identification  is  made  by  using  the  last 
digit  of  the  op-code  and  the  INTEL  instruction  type.  The  eight  INTEL 
instruction  types  (called  K  in  the  assembler)  are  based  on  the 
references  made  and  instruction  length.  These  are  used  to  determine  the 
instructions  in  each  of  three  main  op-code  groups.  These  octal  groups 
are  conveniently  divided  as  0  to  177,  200  to  277,  and  300  to  376.  In 
the  first  group,  all  instructions  for  which  K  is  three  are  arithmetic  or 
logical,  except  if  the  last  digit  (modulo  8)  is  two,  or  if  the  op-code 
is  0  or  166.  In  the  second  group,  all  are  arithmetic  or  logical.  In 
the  third  group,  if  K  equals  four,  the  instructions  are  either  junps  or 
calls.  But  if  K  equals  three,  they  are  1/0  unless  the  last  digit  is  six 
which  indicates  arithmetic  or  logical  immediate  instructions.  This  is 
why  it  is  so  easy  to  determine  the  type  and  count  the  numbers  of  each. 
If  the  op-code  does  not  fall  into  one  of  these  groups,  it  is  simply 
counted  as  an  executable  instruction. 


